High-interest clones


Research into cloned human cells has left the spectre of past scientific fraud behind. But reaction to
the earlier work still holds worthwhile lessons.


When, in 2004, Woo Suk Hwang claimed to have produced a
stem-cell line derived from an embryonic human clone, his
research, done at Seoul National University, sparked intense interest
and hype. Even though Hwang’s work later proved to be fraudulent, all 
advances in the field risk being measured against it. At the same time, 
researchers seek to distance themselves from the episode to the extent 
that its ethical implications for current work are rarely discussed. 

This week, scientists have come the closest of any so far in emulating 
Hwang’s claimed results: on page 70, researchers from the New York 
Stem Cell Foundation Laboratory report using cloning technology 
to reprogram human DNA taken from an adult and create embry- 
onic stem cells. But they do not use the term cloning to describe their 
results. That is one of many contrasts between the research landscape 
now and in 2004. 

Hwang’s claims received worldwide attention. Patient groups 
jumped for joy; scientists around the world used the results to gather 
more funds for stem-cell research; and bioethicists emerged to justify 
or condemn the work. Reaction this week is likely to be more muted. 

Discussion of the ethical concerns raised by such work has calmed,
and the research group behind the latest study dealt with one of the 
most divisive issues — the retrieval of human eggs from donors — 
in a transparent and considered way. Hwang, by contrast, had pro- 
cured eggs unethically and illegally, a problem first brought to public 
attention in Nature (see Nature 429, 3; 2004). Whereas discussion 
of Hwang’s results featured the phrase ‘therapeutic cloning’ and so 
invited (sometimes wilful) confusion with reproductive cloning and 
the spectre of technology misuse, the latest paper refers only to the 
reprogramming of cells to a pluripotent state. A final issue — that 
embryos are destroyed in the process of the research — does still apply. 

The ultimate goal of such research is to create patient-specific stem 
cells for drug screening and the growth of genetically identical tis- 
sue for transplantation. Yet cloning, whether called that or not, is 
no longer the only means to this end, as it seemed in Hwang's time. 
Induced pluripotent stem (iPS) cells, first developed in 2006, now offer 
the same promise without the need for egg recruitment or embryo 
destruction: they are produced from adult cells by introducing a 
few genetic factors to the cell rather than using an entire egg. When 
therapeutic-cloning studies stalled on an egg shortage, iPS cell frenzy 
filled the gap. Competition between the approaches is fierce, and the 
authors of the current study point out the many weaknesses of iPS cells 
to bolster their own work. But their approach, too, has a long way to go.

The biggest reason that the results won't generate Hwang-like 
headlines is that they do not go as far. Hwang claimed to have created a 
cloned human embryo with the same 46 chromosomes as its parent, in 
a very similar way to how scientists have produced living cloned mammals.
Hwang’s embryo would have been viable, generating huge ethical
debate. His claimed results were so advanced that in 2005, Hwang was 
applying to start clinical trials. 


The cells presented this week have an extra 23 chromosomes from the
egg. Hwang, like most researchers in the field, removed this DNA and 
used the egg merely to drive reprogramming; it didn’t work. The latest 
study left the egg DNA in, and says that some element of it is essential. 

The cells derived from this ‘triploid’ embryo show many of the
functions of normal cells, but such embryos are not viable and it is
not yet clear how triploid cells would mimic the behaviour of cells in
tissue. No one will be calling them clinically relevant any time soon.
Still, iPS cell work is on the defensive, and this study provides proof
that human somatic cells can be reprogrammed.

Now, researchers have to prove that the work is a step towards a
biomedically useful stem-cell line. The authors are confident
that they can produce a stem-cell line from
a ‘normal’ diploid cloned embryo, as Hwang claimed to do. They will
have to work out what it is in the egg’s genetic material that is necessary
for the reprogramming.

The latest achievement points in the same direction as Hwang’s 
claims. If researchers were to find the magic element in the egg, not only 
would there again be excitement, but the old ethical issues would resur- 
face. Hype around potential procedures would increase the market for 
eggs, which is perhaps hard to justify. The embryos would be viable, no 
doubt again producing fears of self-cloning dictators. (For that reason, 
this might be a good time for the United Nations to hammer out cloning 
regulations or restrictions, which have been hamstrung by political and 
religious debate.) And desperate patients would find doctors ready to 
give them unproven and unsafe embryonic-stem-cell treatments. 

The results might look mundane. But the potential for reasoned
excitement and irrational hype remains.


The games begin 


Frustrations of the newest European member 
states will shape debate over research funding. 


With some €80 billion (US$105 billion) to distribute, the next
European research funding programme will have one of the
world’s most generous science budgets. The European Com- 
mission has promised radical change to the programme, called Horizon 
2020, and researchers, politicians and commentators have been waiting
to see the results. This week, Nature reveals the programme’s new look.
A leaked draft of the commission’s plans for Horizon 2020,
discussed on page 16, reveals the commission’s admirable and
much-needed attempt to make applying for funding and participating 
in the programme a lot easier for researchers. 

It is true that the commission has attempted to streamline past pro- 
grammes, notably the current Seventh Framework Programme, which 
runs until 2013. But wider and deeper change is needed — particularly, 
as the commission now suggests, to harmonize the criteria for evaluat- 
ing research proposals and judging what counts as eligible research 
costs across the different sections of the programme. This means that 
researchers will need to learn only one set of rules, whether they apply 
for funding for collaborative research projects that tackle key societal 
challenges such as energy efficiency, or for grants from, for example, 
the Budapest-based European Institute of Innovation and Technology. 

The proposals also suggest the provision of better guidance for 
researchers who must fill out time sheets to satisfy the commission's 
demand for financial accountability once research projects are complete. 
Significantly, the plans abolish time recording for researchers who work 
full-time on one project, such as those with grants from the European 
Research Council, which funds frontier research across Europe. This 
comes as a welcome move, as there is little point in having all that money 
available if the rules of play are so complicated and time-consuming 
that they discourage researchers from applying. The cream of Europe's 
science crop have, in the past, turned their backs on the funding pro- 
gramme to compete for other, less-bureaucratic funding streams, threat- 
ening to push the Framework programme towards mediocrity. 

There is no guarantee that the revisions go far enough to halt this 
trend. And as the finer details of the programme are hammered out, 
including developing the annual calls for proposals, the commission 
should allow researchers more freedom to draw up research proposals, 
rather than continuing to prescribe the precise projects that it wants 
to fund. It is scientists, not Brussels bureaucrats, who are best placed 
to know what is new and interesting. 

Missing from the leaked proposals is a clear solution to the tension 
building among the 12 newest European member states — known 
as the EU 12, including Poland and Romania — which feel excluded 
by the drive to fund excellent science. With weaker national science
and technology systems, researchers in these countries often lose out
to their counterparts in the scientifically stronger nations, who are 
able to write better grant proposals. Researchers in the EU 12 com- 
plain that young research talent is not being given the support or the 
opportunity to show its potential. They are not alone in their concerns 
over the uneven geographical distribution of the programme's funds. 

Members of the European Parliament’s
industry, research and energy committee
said in a report on 31 August that they find it
unacceptable that the lion’s share of research
funds goes to the richer member states.

Traditionally, the commission allocates
support for national capacity-building
through a separate funding stream available
in the European Union's budget, called structural
funds. The commission encourages
their use for research purposes in the newer
member states. But it has had mixed results, in part because govern- 
ments prefer to use the funds for improvements that their voters can 
see and use, such as new roads. It is not enough for the commission 
to claim that structural funds can help to put newer member states 
on the path towards excellent science, as it does in the Horizon 2020 
draft. Rather, it must propose concrete initiatives and reforms that 
encourage those governments to use these funds for research. A key 
starting point could be to cut the red tape around the use of struc- 
tural funds, which is even more difficult to navigate than the research 
Framework programme. 

Nature applauds the commission’s hard-fought efforts to 
prioritize excellence as a key funding criterion — specifically, its plan 
to devote one-third of the programme to excellent research. This focus 
will be ever more important if Europe is to compete on the global 
research stage. Nevertheless, the frustrations of the EU 12 countries 
need to be addressed, not least because they, along with members of 
the European Parliament, could delay agreement on the programme 
plans. This issue is likely to dominate much of the debate on the shape 
of European funding over the next 18 months. So, let the games begin.


Back to the Futures 


As Nature’s science-fiction column reaches a 
milestone, we recall some of the highlights. 


This week sees the 400th science-fiction story published in
Nature journals under the ‘Futures’ banner. The number 400
is, of course, only significant to those of us with ten digits. It's
more impressive in binary (110010000), although nothing special in
Hex (190th), and the Octalonians of the Octillian system (our keenest
readers) will mark it as their 620th.

The number, however presented, includes all the stories we have
published in Nature — on, off, simultaneously or instead of — since
Arthur C. Clarke's inaugural salvo on 4 November 1999, as well as those
featured in the completely separate time-stream of Nature Physics,
a few parsecs away.

Looking back at the Futures, as they say, we find that the column,
while barely noticed by many, sitting as it does at the back of each
printed issue (although free to all online), is a guilty pleasure for
the discerning few. The anthology, Futures from Nature, was given
a starred review by Publishers Weekly, and, in 2005, the column won
Nature the accolade of ‘Best Science Fiction Publisher’ from the
European Science Fiction Society. (We ignore those wags who say that
everything that Nature publishes is science fiction.)

Among the canon of stories published in Futures are missives from
superluminaries of the genre: Michael Moorcock, Frederik Pohl and
Ursula Le Guin. (Had Isaac Asimov been alive, he'd probably have
written the lot.) But there have also been tales from other established 
writers, perhaps less well known to Nature readers, and many more 
from scientists — and others — trying out fiction for the very first 
time. We've had lesbian robots from a senior citizen in Alaska, the 
shade of Michael Jackson from a virologist in Singapore, the prob- 
lems of copyrighting dreams from a software consultant in India, 
and intergalactic country music by a student from Malaysia. Futures 
was also the venue for the first story ever sold by high-school student 
Shelly Li of Omaha, Nebraska — who is now just about to publish 
her first novel. 

You, too, can join the throng by following the exploits of Futures 
on Facebook (go.nature.com/mtoodm) or by sending your story 
(850-950 words) to futures@nature.com. But beware, Futures has 
become a victim of its own success — like trying to nail jelly to the 
ceiling, it is now almost as hard to get a story into Futures as to have a
research paper published in Nature. 

Futures, like radio signals from distant suns, will surely come and 
go. But as the man said (the ‘man’, depending on which web page
you read, being Yogi Berra, Niels Bohr, Woody Allen or, who knows, 
Donald Rumsfeld), prediction is very difficult, especially about 
the future. As such, we intend to keep Futures until we (or you) get 
bored of it, or until Earth is struck by an aster- 
oid, whichever comes first. The first seems 
unlikely. As for the second, we shall no doubt 
have other, more pressing concerns. Here's to 
the next 110010000.


The voice of science: let’s agree to disagree

Consensus reports are the bedrock of science-based policy-making. But
disagreement and arguments are more useful, says Daniel Sarewitz.

When scientists wish to speak with one voice, they typically
do so in a most unscientific way: the consensus report. The
idea is to condense the knowledge of many experts into a
single point of view that can settle disputes and aid policy-making.
But the process of achieving such a consensus often acts against these
goals, and can undermine the very authority it seeks to project.

My most recent engagement with this form of penance is marked this
week with the release of Geoengineering: A National Strategic Plan for
Research on Climate Remediation. Sponsored by the Bipartisan Policy
Center in Washington DC, the report reflects more than a year of
discussion between 18 experts from a diverse range of fields and organizations.
It sets out, I think, many valuable principles and recommendations.

The discussions that craft expert consensus, however, have more in
common with politics than science. And I don’t think I give too much
away by revealing that one of the battles in our
panel was over the term geoengineering itself.

This struggle is obvious in the report's title,
which begins with ‘geoengineering’ and ends with
the redundant term ‘climate remediation’. Why?
Some of the committee felt that ‘geoengineering’
was too imprecise; some thought it too controversial;
others argued that it was already commonly
used, and that a new term would create confusion.

I didn't have a problem with ‘geoengineering’,
but for others it was a do-or-die issue. I yielded
on that point (and several others) to gain political
capital to secure issues that had a higher priority
for me. Thus, disagreements between panellists
are settled not with the ‘right’ answer, but by
achieving a political balance across many of the
issues discussed.

This political essence of consensus leads to other difficulties. Ask a
panel to address broad questions — future directions for a field, say,
or ways to improve a government programme — and the recommendations
that come back are typically bland and predictable. New and
controversial ideas are inherently difficult for experts to agree on. In
the absence of consensus, the default position is simply to call for more
research — the one recommendation that most scientists can get behind.

Sometimes, expert panels are asked to find consensus on narrow
technical questions at the heart of public controversies. The hope
is that a unified scientific voice will resolve the dispute, but it rarely
works out that way. In 2000, the US National Academies assembled
climate experts to resolve discrepancies in surface and satellite climate
temperature records, as if this would help to settle the political debate.
A decade on, it is clear that the goal was not met.

And in 2009, at the height of the US debate on
health-care reform, the US Preventive Services
Task Force released a consensus report on the
risks and benefits of mammograms. Rather than
clarifying anything, the key recommendation — that mammograms
were being overutilized — became instant ammunition for reform 
opponents, who viewed it as a threat to patient autonomy. 

The fuss over mistakes in the 2007 reports by the Intergovernmen- 
tal Panel on Climate Change highlights a related problem: a claim of 
scientific consensus creates a public expectation of infallibility that, 
if undermined, can erode public confidence. And when expert con- 
sensus changes, as it has on health issues from the safety of hormone 
replacement therapy to nutritional standards, public trust in expert 
advice is also undermined. 

The very idea that science best expresses its authority through 
consensus statements is at odds with a vibrant scientific enterprise. 
Consensus is for textbooks; real science depends for its progress on 
continual challenges to the current state of always-imperfect knowl- 
edge. Science would provide better value to 
politics if it articulated the broadest set of plau- 
sible interpretations, options and perspectives, 
imagined by the best experts, rather than forcing 
convergence to an allegedly unified voice. 

Yet, as anyone who has served on a consensus 
committee knows, much of what is most inter- 
esting about a subject gets left out of the final 
report. For months, our geoengineering group 
argued about almost every issue conceivably 
related to establishing a research programme. 
Many ideas failed to make the report — not 
because they were wrong or unimportant, but 
because they didn’t attract a political constitu- 
ency in the group that was strong enough to 
keep them in. The commitment to consensus 
therefore comes at a high price: the elimination 
of proposals and alternatives that might be valuable for decision- 
makers dealing with complex problems. 

Some consensus reports do include dissenting views, but these are 
usually relegated to a section at the back of the report, as if regretfully 
announcing the marginalized views of one or two malcontents. Science 
might instead borrow a lesson from the legal system. When the US 
Supreme Court issues a split decision, it presents dissenting opinions 
with as much force and rigour as the majority position. Judges vote 
openly and sign their opinions, so it is clear who believes what, and why 
— a transparency absent from expert consensus documents. Unlike 
a pallid consensus, a vigorous disagreement between experts would 
provide decision-makers with well-reasoned alternatives that inform 
and enrich discussions as a controversy evolves, keeping ideas in play 
and options open. That is something on which we should all agree.


Daniel Sarewitz is co-director of the Consortium for Science, Policy and 
Outcomes at Arizona State University, and is based in Washington DC. 
e-mail: daniel.sarewitz@gmail.com 


6 OCTOBER 2011 | VOL 478 | NATURE |7 


© 2011 Macmillan Publishers Limited. All rights reserved 


SEVEN DAYS


NIH-budget duel

A bill released on
29 September by the spending
committee of the US House 
of Representatives would 
boost the budget of the 
National Institutes of Health 
(NIH) by US$1 billion, or 
3.3%, in 2012. That puts it in 
direct conflict with a Senate 
bill cutting the agency’s 
budget by $190 million, to 
$30.5 billion, next year. The 
two bills must be resolved in 
the coming weeks. The Senate 
bill also establishes and funds 
a proposed translational 
medicine centre at the NIH, 
but the House bill does not 
mention it. 


Energy priorities 
The US Department of 
Energy released its inaugural 
Quadrennial Technology 
Review on 27 September, 
laying out a multi-year agenda 
that sets priorities in six areas, 
including vehicle efficiency, 
alternative hydrocarbon fuels 
and cleaner electricity. It is 
modelled on the Quadrennial 
Defense Review, an analysis 
that sets the tone and direction 
of US defence policy. The
first energy-technology
review made no radical
recommendations, mostly
keeping to the agency’s current
course. See go.nature.com/uyabvv for more.


SPICE on ice


A UK experiment to test climate-engineering technology by spraying
water from a balloon 1 kilometre above Earth was put on hold this
week — owing partly to protests from environmentalists. The
Stratospheric Particle Injection for Climate Engineering (SPICE)
project aims to trial technology for spraying sulphate aerosols at
heights of up to 25 kilometres, to cool Earth by reflecting sunlight.
Protesters said that the test would violate a decision by the
Convention on Biological Diversity not to undertake large-scale
geoengineering experiments. SPICE scientists say that the halt
followed a consultation which raised concerns that there had not been
enough engagement with environmental groups. See
go.nature.com/hmuljg for more.


Farewell to the Tevatron

After more than 25 years of colliding particles, the massive Tevatron
accelerator at Fermilab in Batavia, Illinois, was turned off for good
on 30 September. The collider helped to confirm the standard model of
physics — it was where the top quark was found in 1995 — and it spent
its final years restricting the possible mass range of the Higgs
boson. Scientists are still analysing those data, and Fermilab is
shifting to smaller-scale experiments (see Nature 477, 379; 2011).


Jackson expands

The Jackson Laboratory, a medical research centre based in Bar Harbor,
Maine, is hoping to set up a major satellite facility for personal
genomics research at the University of Connecticut Health Center in
Farmington. The lab had previously tried for more than a year to site
its facility in Florida, but had to pull out in June (see Nature
474, 133; 2011) after Florida politicians said that the state could
not afford to invest in it. But Connecticut is prepared to contribute
US$291 million towards the $1.1-billion lab, governor Dannel Malloy
announced on 30 September. The deal depends on agreement from the
state’s legislature.


Mouse phenome

An ambitious effort to work out the function of all of the
approximately 21,000 protein-coding genes in the mouse genome has the
funds it requires for the initial phase of its mission. After a meeting
in Washington DC last week, the International Mouse Phenotyping
Consortium, which was launched last year (see Nature 465, 410; 2010),
announced that research agencies in eight countries will analyse 5,000
mouse genes by 2016, with part of the work being financed by the US
National Institutes of Health. The work involves inactivating a
particular gene and then investigating how the mouse’s
characteristics change.


Global eco-network 
Plans for a global network to 
monitor agricultural areas 
came a step closer to being 
realized last week, as scientists, 
philanthropic organizations 
and big businesses met at 
Columbia University in New 
York to discuss the idea.
Researchers described pilot
sites in Africa that collect data
on soils, nutrients and land
cover, but also track changing
agricultural practices and
socioeconomic trends. Jeffrey
Sachs, director of Columbia’s
Earth Institute, said he hoped
that — with industrial funding
— the network might grow to
some 500 sites in two or three
years. See go.nature.com/yu5qdt for more.


Graphene cash 


The UK government is investing £50 million (US$78 million) to create a
research and technology hub focusing on graphene, the one-atom-thick
sheets of carbon for which physicists Andre Geim and Konstantin
Novoselov, based at the University of Manchester, UK, won the 2010
Nobel Prize in Physics. British scientists — still smarting from broad
funding cuts — welcomed the 3 October announcement, which also
included £145 million to build infrastructure for high-performance
computing. See go.nature.com/7pzaio for more.


China space lab 


Tiangong-1, a test module for China's space station, was launched on
29 September (pictured). The 10.4-metre cylinder is orbiting alone and
unmanned for now, but in a few weeks’ time China will try to dock it in
orbit with an unmanned spacecraft. Two missions carrying Chinese
astronauts will follow in 2012, with more test modules over the
subsequent three years. If all goes to plan, China will launch further
modules to be assembled into a space station by 2020 (see Nature 473,
14-15; 2011).

NASA said on 29 September that it had spotted 911 of the estimated 981
near-Earth objects larger than 1 kilometre across, using infrared data
to recalculate the relationship between asteroid reflectivity and
size. The agency has now met the US Congress's 1998 mandate to find
more than 90% of ‘killer’ asteroids. A 2005 mandate extended that to
asteroids down to 140 metres across; so far, NASA thinks it has found
35% of the estimated 13,200 such objects.


HIV prevention 


Hopes of preventing HIV by 
giving drugs to uninfected 
women were dented last
week when part of a large
clinical trial testing the idea 
was halted. The Microbicide 
Trials Network said that it will 
no longer give out tenofovir 
antiretroviral tablets in the 
VOICE study — involving 
5,029 women across South 
Africa, Zimbabwe and Uganda 
— after an independent 
monitoring board found that, 
on the basis of results so far,
it wasn't possible to show
that the tablets were better 
than placebo. Other arms of 
the trial, involving a gel and 
another antiretroviral tablet, 
continue. Previous trials have 
yielded contradictory results 
on whether the practice works 
for women, although it has 
been shown to reduce HIV 
risk for men. 


BUSINESS
Amazon dam halted 


Construction of the Belo 
Monte Dam — a massive 
11.2-gigawatt hydroelectric 
plant on a tributary of the
Amazon in Brazil — has for
the second time this year
been put on hold by a federal
judge, who ruled that it could 
damage fish stocks. The dam, 
on the Xingu River in the
state of Pará, is opposed by
environmental campaigners
and indigenous people, but the
government is strongly behind
it, and the environment agency
IBAMA has already granted
a construction licence. The
Norte Energia consortium of 
companies building the dam 
will appeal against the ruling. 


US science medals 


Cloning pioneer Rudolf Jaenisch, a biologist at the Whitehead
Institute in Cambridge, Massachusetts, is one of seven researchers to
be awarded the US National Medal of Science this year. The White House
announced the list of recipients — along with five awardees of the
National Medal of Technology and Innovation — on 27 September. The
medals are the highest honours that the United States bestows on its
scientists and engineers. See go.nature.com/j78ggo for more.


NASA REACHES ASTEROID-SPOTTING GOAL

The agency has found 90% of the asteroids larger than 1 kilometre
near Earth, plus thousands that are as small as a few metres across.


9-12 OCTOBER
The Geological Society of America meets in Minneapolis, Minnesota.
go.nature.com/5vvixp

10-21 OCTOBER
Delegates from 194 countries will gather in Changwon, South Korea, to
discuss progress and targets to tackle desertification at the 10th
meeting of the United Nations Convention to Combat Desertification.
go.nature.com/8zq4jl

11-15 OCTOBER
The 12th International Congress of Human Genetics takes place in
Montreal, Canada.
www.ichg2011.org


Nobel prizes 


This year’s Nobel Prize in
Physiology or Medicine
went to Bruce Beutler, Jules
Hoffmann and Ralph Steinman,
for their work on the immune 
system. The physics prize was 
won by Saul Perlmutter, Brian 
Schmidt and Adam Riess, for 
discovering the accelerating 
expansion of the Universe by 
observing distant supernovae. 
See pages 13 and 14 for more.
Nature went to press before the
chemistry prize was awarded,
but full details will be available
at go.nature.com/uio77d.


NATURE.COM
For daily news updates see:
www.nature.com/news

Immunity service: (from left) Jules Hoffmann, Bruce Beutler and Ralph Steinman share this year’s Nobel Prize in Physiology or Medicine.


NOBEL PRIZE 


Nobel announcement 
marred by winner’s death 


Immunology takes prize for medicine, but award comes three days too late for one recipient. 


BY EWEN CALLAWAY 


Before the immune system can attack an
invading pathogen, it must identify the
intruder. Breakthroughs in understanding
this process have garnered three scientists
this year’s Nobel Prize in Physiology or Medi- 
cine, which was announced on 3 October in 
Stockholm. 

But the award was quickly overshadowed by 
sadness when it emerged that the winner of half
the 10-million-Swedish-krona (US$1.5-mil- 
lion) prize, Ralph Steinman of the Rockefeller 
University in New York, had died on 30 Sep- 
tember. Although Nobel prizes are not awarded 
posthumously, the Nobel committee was not 
aware of Steinman’s death when it reached its 
decision. The committee has since confirmed 
that the award still stands. Ironically, Steinman 
was being treated for his pancreatic cancer with 
a therapy derived from his original discovery. 

Together, his work and that of the other
two winners — Jules Hoffmann at the French
National Centre for Scientific Research 
(CNRS) Institute of Cell and Molecular Biol- 
ogy in Strasbourg and Bruce Beutler of the 
Scripps Research Institute in La Jolla, Califor- 
nia — help to describe how two separate arms 
of the immune system work. 

Steinman discovered a type of immune cell, 
known as a dendritic cell, that is vital to the 
‘adaptive’ immune system, which works out 
exactly which pathogen has invaded the body 
in order to trigger a targeted response. Hoff- 
mann and Beutler earned their share of the 
prize for discovering a key to a more immedi- 
ate line of defence, the ‘innate’ immune system, 
which identifies a foreign body as a potential 
pathogen. They identified the molecular sen- 
tinels that first sound the alarm by recognizing 
features shared by numerous pathogens. 

Steinman’s efforts to understand the 
immune system began in the early 1970s, 
when he joined the laboratory of Zanvil Cohn
at Rockefeller as a postdoc. Cohn’s group
was studying an immune cell called the mac- 
rophage, which engulfs pathogens and other 
debris. Most researchers thought that mac- 
rophages then alerted adaptive immune cells 
called T cells to the presence of a specific 
pathogen. Once activated, T cells multiply and 
combat infection, either by killing pathogen- 
infected cells or by steering another type of 
immune cell, the B cell, to produce pathogen- 
blocking antibodies. 

In Cohn’s lab, Steinman identified another 
type of immune cell, which he named the dendritic
cell because of its long, tree-like arms¹.
Cohn and Steinman showed that these cells are 
much more important than macrophages in 
activating T cells. 

At first, dendritic cells “were a minor cell 
and everybody was loath to accept them”,
recalls Siamon Gordon, an immunologist at 
the University of Oxford, UK, who worked 
with Cohn and Steinman. “It was a bit like 
having two Popes — it was the dendritic
cells versus the macrophages.” Steinman 
continued doggedly collecting data, and 
eventually won over his critics. 

Two decades after the discovery of 
dendritic cells’ crucial role, a team led by 
Hoffmann was investigating why fruitflies, 
which lack an adaptive immune system, 
don’t succumb to fungal infection. In 1996, 
they reported that the Toll gene, previously 
linked to embryo development, was also 
important for battling infections². Flies
with mutations in Toll died when exposed 
to bacteria or fungi. 

At around the same time, a team led by 
Beutler, then at the University of Texas 
Southwestern Medical Center in Dallas, had 
spent six years looking for an immune-sys- 
tem gene in mice that produces a protein to 
recognize lipopolysaccharide (LPS), a mol- 
ecule produced by certain bacteria that can 
cause septic shock. “We were obsessed,” says
Alexander Poltorak, an immunologist now 
at Tufts University in Boston, Massachusetts, 
who worked on the project. “We always 
thought we would find the gene tomorrow.” 

The team eventually found its LPS- 
sensing gene, and it looked remarkably like 
Hoffmann’s Toll³. Linking the two findings
paved the way for the discovery of other Toll- 
like receptors that sense molecules made by 
pathogens but not their hosts, and form a 
critical part of the innate immune system. 

The discoveries of dendritic cells and 
innate immune receptors have already 
had an impact on medicine. Vaccines are 
typically administered with an adjuvant, 
such as a metal, to prompt a rapid immune
response. Drug companies such as Glaxo- 
SmithKline are now developing adjuvants 
that activate Toll-like receptors. 

“By doing this we are mimicking what 
actually happens during an infection with- 
out having an infection,” says Vincenzo 
Cerundolo, associate director of the UK 
Medical Research Council Immunology 
Unit in Oxford. 

Meanwhile, Provenge (made by the bio- 
technology company Dendreon of Seattle, 
Washington), the only cellular immune 
therapy against cancer to be approved by 
the US Food and Drug Administration, 
exploits dendritic cells that recognize a 
molecule produced by prostate tumours. 
Culturing and reinjecting the cells back into 
the patient fortifies the immune response 
against the tumour. 

“The reason why the field has progressed 
so much and is now in the clinic is because 
we understand how to activate the immune 
system,” says Cerundolo.


1. Steinman, R. M. & Cohn, Z. A. J. Exp. Med. 137, 1142-1162 (1973).
2. Lemaitre, B. et al. Cell 86, 973-983 (1996).
3. Poltorak, A. et al. Science 282, 2085-2088 (1998).


14 | NATURE | VOL 478 | 6 OCTOBER 2011


NOBEL PRIZE


Stellar performance
nets physics prize


Nobel for supernovae signals of accelerating Universe.


BY GEOFF BRUMFIEL


Shining brightly: (from left) Saul Perlmutter, Brian Schmidt and Adam Riess.


Three astrophysicists have been awarded
a Nobel prize for planting a perplexing
puzzle at the heart of cosmology.

Half of the Nobel Prize in Physics goes
to Saul Perlmutter of Lawrence Berkeley
National Laboratory in California for leading
a team that discovered that the Universe is
expanding at an ever-increasing rate (S. Perl-
mutter et al. Astrophys. J. 517, 565-586; 1999).
Brian Schmidt of the Australian National
University in Weston Creek and Adam Riess
of the Space Telescope Science Institute in Bal-
timore, Maryland, share the other half of the
prize for independent measurements of the
cosmic acceleration (A. G. Riess et al. Astron. J.
116, 1009-1038; 1998), which researchers have
struggled to explain ever since.

“I feel kind of weak in the knees,” Schmidt
told reporters in Sweden via telephone. “It sort
of feels like when my children were born.”

All three scientists reached their conclusions
on the basis of measurements of distant Type Ia
supernovae. These occur in very specific types
of binary star system, in which a white dwarf
star tears matter away from its partner until it
gains enough mass to explode. At their peak,
Type Ia supernovae always emit roughly the
same amount of light, making them useful as
‘standard candles’ by which to measure vast
distances across the cosmos.

In the late 1980s and early 1990s, the prize-
winners precisely measured the brightness of
these supernovae using newly developed digi-
tal sensors. They then compared the brightness
to the redshift — the change in colour of the
light that results from the motion of the super-
novae away from us. Both teams found that the
supernovae were dimmer than expected at the
measured redshift. The inescapable conclusion 
was that the Universe was not only expand- 
ing — which astronomers first realized in the 
1920s — but expanding faster and faster. 

Schmidt says that the finding was initially 
“pretty perplexing”. Most astronomers had 
expected that the Universe’s rapid growth 
following the Big Bang would gradually slow 
down as gravity pulled distant galaxies towards 
each other. Yet the discovery was accepted 
almost immediately by the astronomical com- 
munity — in part because the idea of a cosmic 
pressure pushing the Universe outwards had 
already been mooted by Albert Einstein. 

When Einstein applied his general theory of 
relativity to the Universe as a whole in 1917, his 
equations included a ‘cosmological constant’ 
which described just such an outward force. 
Over the past decade, observations of the large- 
scale structure of the Universe, together with the 
cosmic microwave background radiation — the 
faint afterglow of the Big Bang — have also indi- 
cated that the majority of the Universe’s energy 
remains undetected. Today, the astronomical 
community accepts that about 73% of the Uni- 
verse’s energy is invested in this cosmic accelera- 
tion. Known as dark energy, it remains a largely 
mysterious force. “Nobody really knows what it 
is that has been discovered,” says Peter Coles, an
astrophysicist at Cardiff University, UK. 

The predominant view is that dark energy 
results from quantum fluctuations in the 
vacuum of space, but efforts to use quantum 
theory to describe it have so far failed. Other 
theories, including modifications of gravity, 
have gained little acceptance. 

“It could be none of the above,” Coles 
says. But “we wouldn’t be on the trail of this
‘none-of-the-above’ if it hadn’t been for these
experiments”.




MENTAL HEALTH 


IN FOCUS | NEWS 


Trillion-dollar brain drain 


Enormous costs of mental health problems in Europe not matched by research investment. 


BY KERRI SMITH 


Brain disorders cost Europe almost
€800 billion (US$1 trillion) a year —
more than cancer, cardiovascular disease
and diabetes put together. That’s the conclusion 
of a report¹ commissioned by the European
Brain Council that provides the most
comprehensive assessment of the financial 
consequences of mental ailments so far. 

The report's authors argue that these enor- 
mous costs — which exceed the entire gross 
domestic product of the Netherlands — mean 
that research into brain disorders receives dis- 
proportionately little funding compared with 
other diseases. They call on politicians and 
funders to step up support for basic research 
on these conditions, which are so costly 
because they often require long-term care 
and erode the productivity of those affected 
for years or decades. 

The report is an update of a similar survey 
in 2005, which found that brain disorders 
were costing Europe €386 billion². Since then,
Bulgaria and Romania have joined the Euro- 
pean Union and seven more categories of 
disorder have been added to the assessment, 
including eating disorders, sleep disorders, 
mental retardation, and childhood and develop- 
mental disorders such as autism. The authors 
say that their new estimate, although double the 
2005 figure, is likely to be “very conservative”. 

Mood disorders top the cost estimates, 
consuming €113.4 billion a year, followed
closely by dementia, at €105.2 billion (see
‘Heavy burden’). On average, the annual cost 
per citizen is €1,550, with Luxembourg and 
the United Kingdom spending the most per 
head of population. 

Drugs, visits to doctors and hospitaliza- 
tions — the direct health-care costs — make 
up 37% of the bill. A further 23% is spent on 
direct non-medical costs, including informal 
care, social services and nursing homes. The 
remainder (40%) is sucked away by indirect 
costs, such as lost productivity as a result of 
time off work or early retirement. One reason 
for the high indirect costs is that “people don’t
tend to die quickly from brain disorders”, says
Jes Olesen, the neurologist at the University
of Copenhagen who headed the survey team.
“People live for years in a disabled condition.”
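
As a rough illustration of the arithmetic behind these shares, the sketch below applies the reported percentages (37%, 23% and 40%) to the €797.7-billion total given in the ‘Heavy burden’ graphic, and compares that total with the 2005 estimate of €386 billion. The function and variable names are hypothetical, and the rounded outputs are approximations derived from the published percentages, not figures taken from the report itself.

```python
# Figures quoted in the article and its 'Heavy burden' graphic.
TOTAL_COST_BN = 797.7   # total cost of brain disorders in Europe, 2010 (billion euros)
COST_2005_BN = 386.0    # estimate from the 2005 survey (billion euros)

# Shares of the total bill as reported in the text.
SHARES = {
    "direct health care": 0.37,
    "direct non-medical": 0.23,
    "indirect": 0.40,
}


def split_costs(total_bn, shares):
    """Return each component's cost in billion euros, rounded to one decimal."""
    return {name: round(total_bn * share, 1) for name, share in shares.items()}


breakdown = split_costs(TOTAL_COST_BN, SHARES)
growth_factor = round(TOTAL_COST_BN / COST_2005_BN, 2)

for name, cost in breakdown.items():
    print(f"{name}: about {cost} billion euros")
print(f"2010 total is about {growth_factor} times the 2005 estimate")
```

On these assumptions, the growth factor of roughly 2 is consistent with the authors' description of the new estimate as double the 2005 figure.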

More than 100 scientists and health economists
in Europe were involved in collecting
data for the report. For
each country, the team found out how many
people had a particular condition, estimated
the financial costs, and then calculated
Europe-wide figures. Where data weren’t
available, the prevalence and costs of disor-
ders were estimated from figures from other
countries. “They are of course imputations,
but they are the best available,” says Olesen.

HEAVY BURDEN
Six categories of illness account for more than
half of the costs of brain disorders in Europe.
Indirect costs — such as working time lost to
illness — are responsible for about 40% of the
total financial burden.
Total cost (2010): €797.7 billion.
‘Other’ includes brain tumour, child/adolescent disorders,
eating disorders, epilepsy, mental retardation, multiple
sclerosis, neuromuscular disorders, Parkinson’s disease,
personality disorders, sleep disorders, somatoform
disorder, stroke, traumatic brain injury.

No directly comparable reports exist 
elsewhere in the world, but several studies 
have looked at the costs of individual condi- 
tions, such as bipolar disorder, attention deficit 
hyperactivity disorder and schizophrenia, in 
both Europe and the United States. Overall, 
health-care costs per person are similar in 
both regions, but the direct costs — doctors 
and drugs — are higher in America. 

The drug industry, however, is increasingly 
shying away from these disorders. “The basic 
science is such that it’s quite difficult to iden- 
tify a new target, so you start with your hands 
tied behind your back,” says Patrick Vallance, 
head of medicines discovery and development
for London-based drug giant GlaxoSmithKline,
which last year stopped funding
drug-development programmes in psychiatry,
pain and cognitive neuroscience. Vallance also 
cites problems with unrealistic animal models, 
unpredictable results from early trials and dif- 
ficulties in diagnosing and allocating patients 
to trials. “At every stage of the process your risk 
is very much higher” for brain disorders than 
for other conditions, he says. 

A report produced by the European Brain 
Council in 2006 (ref. 3) estimated that Europe 
spent about the same amount on brain research 
as on cancer research (about €4 billion each), 
despite the much higher cost of brain disorders. 

Olesen says that the report presents clear 
evidence that greater scientific effort is 
required to tackle brain disorders. “The only 
way is to increase research and understand 
these disorders better,” says Olesen. Focusing 
on preventing these disorders in the first place 
would have the greatest cost benefit, he adds.

1. Gustavsson, A. et al. Eur. Neuropsychopharmacol. 21, 718-779 (2011).
2. Andlin-Sobocki, P. et al. (eds) Eur. J. Neurol. 12,
3. Sobocki, P. et al. Eur. J. Neurosci. 24, 2691-2693 (2006).


6 OCTOBER 2011 | VOL 478 | NATURE | 15 


© 2011 Macmillan Publishers Limited. All rights reserved 


SOURCE: REF. 1 




Europe cuts funding red tape 


Changes to €80-billion Horizon 2020 research programme simplify grant process.


BY NATASHA GILBERT 


It is not the revolution that some researchers
hoped for. But Nature has learned that the
next iteration of the European Union’s
multibillion-euro research programme
includes modest yet significant reforms that
could make the vast pot of funds easier for
researchers to tap.

The European Commission acknowledges
that burdensome administration, includ-
ing a plethora of time sheets, financial audits
and complicated rules, has discouraged some
researchers from taking part in its €50.5-billion
(US$67.3-billion) Seventh Framework Pro-
gramme, known as FP7. Plans for the next
cycle of funding in 2014-20, called Horizon
2020, are not due for publication until late
November, and although the commission
has requested €80 billion for the programme,
a budget has not yet been agreed. But Nature’s
early look at the plans reveals that Horizon
2020 will come with significantly less red tape.

The plans say that researchers will have to
navigate only one set of rules for all initiatives
in the programme, including collaborative
research projects and those that fall under the
auspices of the European Institute of Innova-
tion and Technology, headquartered in Buda-
pest. The rules will cover how proposals are
evaluated, the criteria used to award fund-
ing and what indirect costs can be claimed
for projects. Some variation in the rules will
be allowed for issues such as the exploitation
of research results and intellectual-property
rights, in order to help innovation flourish —
one of the key goals of Europe’s commissioner
for research, innovation and science, Maire
Geoghegan-Quinn. “A common set of basic
principles” across all the initiatives would lead
to “considerable trimming and lightening of
rules”, the commission’s proposals say.

“Any move to simplify the application and
management of grants is to be welcomed,” says
Adam Hurlstone, a cancer researcher at the
University of Manchester, UK. Hurlstone this
year won a grant from the European Research
Council (ERC), the pan-European initiative
to fund frontier research solely on the basis of
excellence, which has been lauded by research-
ers for its relatively simple application process.
He says that he has previously been put off from
applying for grants from other initiatives in the
framework programme “due to my perception
of the organizing and administrative burden”.

Ernst-Ludwig Winnacker (left) and European research commissioner Maire Geoghegan-Quinn have
helped to reshape the Framework research funding programme.

The plans also say that the framework
programme will be reorganized to group all
‘excellence’ initiatives together under one 
administrative umbrella. This will include the 
ERC grants and the popular Marie Curie fel- 
lowships, which fund researchers to move from 
one EU country to another to work on a pro- 
ject. Unlike previous programmes, a portion of 
Horizon 2020 funding will be targeted towards 
six key challenges affecting society, including
health, food security and clean energy.

Researchers should also find it easier to claim
back the indirect costs of research projects,
such as the provision of laboratory space
and maintenance of equipment. The commission
will sweep away the various ways
of calculating indirect costs; instead, it proposes
using a single rate for participants from
institutions without commission-approved 
accounting systems. The move should help 
to avoid “frequent errors” in accounting, the 
commission says. 

But the commission has decided not to 
eliminate the requirement for researchers to 
report in detail their research costs, meaning 
that they will still have to fill out time sheets 
and undergo financial auditing when projects 
are completed. It rejected an option in which 
researchers would be awarded all of their fund- 
ing in a lump sum based on agreed research 
outputs. Instead, it settled for providing clearer 
advice to researchers filling in time sheets, and 
abolished time recording for researchers who 
work full time on a project, such as those with 
ERC grants.

The planning document says that more radical
reforms would have required “major organisation
changes in the commission”, including
building up new skills and reassigning staff to 
different roles. “I would like to have seen a more 
radical approach,” says a senior European sci-
ence adviser, who requested anonymity. “It is 
clear the commission is concerned for its own 
jobs rather than solving the problems that exist.” 

The Horizon 2020 proposals also rejected 
calls to give the ERC more independence and 
reduce the administrative burden imposed 
by the commission (see Nature http://dx.doi.org/10.1038/news.2010.615;
2010). Instead, the
ERC will remain partially under the adminis- 
trative control of the commission, as recom- 
mended in July by a task force including Helga 
Nowotny, the ERC president, and Ernst-Ludwig 
Winnacker, former secretary-general of the 
ERC. “In the past I would have recommended 
radical change, but we came to the conclusion 
that it would be too difficult and could threaten 
the ERC,” says Winnacker.

The plans show that the commission is mov- 
ing forward with other reforms suggested by 
the task force. These include replacing the 
commission’s stifling daily supervision of the
ERC with regular, but less frequent, meetings 
between the ERC’s scientific leadership and a 
commission representative. 

The commission will now refine the proposals 
and assign budgets before they are officially 
published. Discussions between the com- 
mission, member states and the European 
Parliament will take about 18 months. They 
hope to agree on a final version in time for the 
programme to begin in 2014.

SEE EDITORIAL P.5
Secrets of the human
genome disclosed 


Meeting debates ethics of revealing genetic findings. 


BY ERIKA CHECK HAYDEN IN NEW YORK 


Should people be told about any nasty
surprises that scientists discover in their 
DNA during research projects? 

The question is becoming increasingly 
pertinent, as thousands of people sign up 
for studies in which their genomes will be 
sequenced. But, at present, federal laws in the 
United States prohibit researchers from tell- 
ing patients about mutations that might affect 
them or their families unless a certified clini- 
cal lab has confirmed the results — something 
that is not done in most research projects. This 
means that patients often do not learn about 
their mutations until the studies are finally pub- 
lished, a restriction that is meant to ensure they 
are not misinformed by incomplete research. 

The ethical dilemmas became all too real last
year for Gholson Lyon, a geneticist at
the Utah Foundation for Biomedical Research 
in Salt Lake City. He was studying an extended 
family in which some of the boys had been born 
with a constellation of symptoms, including 
thick, wrinkly skin, and who ultimately died of 
cardiac disease before their first birthdays. By 
November 2010, Lyon had convincing evidence 
that a genetic mutation was causing the disease. 
That’s when he learned that one of the women in 
the family was four months pregnant with a boy. 

Lyon knew from his study that the mother 
carried the mutation. But he was not allowed 
to tell her, because the analysis had not been 
performed in a laboratory that was certified 
under the Clinical Laboratory Improvement 
Amendments, which aim to ensure that clini- 
cal tests are accurate and reliable. 

The baby was eventually born with the 
disease — called Ogden syndrome — and later 
died, in the same week that Lyon’s paper on the 
causative mutation was published¹.

At the fourth annual Personal Genomes 
meeting at Cold Spring Harbor in New York 
last week, Lyon argued that researchers should 
routinely conduct their studies in certified lab- 
oratories so that they can provide participants 
with results as soon as possible, adding that he 
plans to do so himself from now on. It is a pressing
issue: according to Richard Gibbs, director
of the Human Genome Sequencing Center
at Baylor College of Medicine in Houston,
Texas, roughly 5,000
human genomes will be sequenced this year,
with some 30,000 expected next year. 

But ethicists point out that although 
researchers and physicians may feel obliged to 
disclose genetic information, they must also 
consider other factors. “This is not just about 
patients or doctors. These disclosures have 
societal implications that need to be consid- 
ered, including downstream cost,” says Ellen
Wright Clayton, director of the Center for 
Biomedical Ethics and Society at Vanderbilt 
University in Nashville, Tennessee. 

Genome sequencing is now starting to be 
used in the clinic to guide diagnosis and treatment
decisions (see News Feature page 22).
At the Medical College of Wisconsin in Milwaukee,
for example, paediatrician and geneticist
David Dimmock offers genome sequencing
to children with undiagnosed diseases.
The programme is controversial because many 
researchers think that too little is known about 
how most rare genetic mutations contribute to 
disease for the knowledge to help patients. Dimmock
points out, however, that a handful of cases have
been reported in which sequencing has led to a
cure or improved treatment².

Using a clinically certified lab, Dimmock’s 
team sequenced the genome of an infant with 
acute liver failure, and discovered that she had 
two mutations in a gene called Twinkle. Earlier 
research had linked those mutations to pro- 
gressive eye and neurological conditions, and 
an associated liver disease. As a result, doctors 
determined that a liver transplant — a standard 
treatment for acute liver failure — would not 
help the infant, and recommended against it³.
She died when she was 6 months old. 

“This was not a happy ending — but in a
sense it was,” says Dimmock. Disclosing the
genetic information spared the infant from 
spending her remaining few months recover- 
ing from a gruelling, unnecessary transplant, 
he says, and saved a scarce liver for a child who 
might benefit from it more.


1. Rope, A. F. et al. Am. J. Hum. Genet. 89, 28-43 (2011).
2. Bainbridge, M. N. et al. Sci. Transl. Med. 3, 87re3 (2011).
3. Goh, V. et al. J. Pediatr. Gastroenterol. Nutr. http://dx.doi.org/10.1097/mpg.0b013e318227e53c (2011).
GENOMES BY THE NUMBERS
Of 1,588 respondents, 289 report having taken a total of
396 genetic tests, ranging from whole-genome sequencing
to testing of a single gene.
For full poll results, visit go.nature.com/Q9ihtf2

Q Have you had a genome analysis and if not, would you?
Have had a genome analysis: 18%. Would if given the opportunity: 54%. Have not/would not: 15%. Not sure whether I would: 13%.

Q How important were the following factors in influencing your decision to have a genome analysis done?
Factors rated from ‘not a factor at all’ (1) to ‘major factor’ (5), by discipline (biology, medicine, other): curiosity; specific health risk; general genetic influences on health; genealogical interests; ancestry; a research project I’m involved in.

Q What types of test or analysis have you had done?
Categories: whole-genome sequence*; single nucleotide polymorphism (SNP) array (e.g. 23andMe); partial genome (e.g. exome); single gene or genes sequenced for medical purposes; single gene or genes sequenced for research; screening for DNA methylation status; screening for copy-number variation; other; I don’t know.
*Corrected for likely reporting errors.




Nature readers flirt with 
personal genomics 


Survey reveals eagerness to use latest DNA technologies. 


BY BRENDAN MAHER 


Ronald Worthington lives an examined
life. The genome analyst, who works
at Southern Illinois University School 
of Pharmacy in Edwardsville, got funding 
from his university to have his entire genome 
sequenced by Complete Genomics in Moun- 
tain View, California. He also ordered the 
sequence of the coding regions of his genome, 
known as the exome, from Otogenetics in 
Tucker, Georgia, and he will soon be giving 
skin and blood samples that will be used to 
immortalize his cells and DNA. He aims 
to contribute it all to the Personal Genome 
Project, an ambitious effort to sequence 
100,000 individuals and post their data and 
medical histories online for anyone to access. 
“This approach is the way to jump-start this 
whole process of integrating human genomic 
data into clinical medicine,” he says.
Worthington is one of 1,588 people who 
responded last month to a Nature poll on 
readers’ attitudes to personal genomics (see 
‘Genomes by the numbers’). Participants were 
recruited by e-mail, and through Nature’s 
online Facebook and Twitter accounts, the 
Nature News Blog and Genomes Unzipped, 
an independent blog that chronicles develop- 
ments in personal genomics. It seems that, 
overall, Nature readers are eager to adopt 
these new technologies. About 18% report 


having had their genomes analysed in some 
way, ranging from whole-genome sequenc- 
ing (about 10 respondents, after correcting 
for reporting errors) to direct-to-consumer 
tests. Of the remainder, 66% say they would 
have their genome sequenced or analysed if the 
opportunity arose. 

Although scientists dominated our sample, 
only some 20% of those whose genomes had 
been analysed reported that their research 
goals or those of others were a major factor in 
the decision. About 30% did some or all of the 

analysis themselves. Worthington, for
example, isolated his DNA for sequencing
and is annotating his exome sequence. But
50% used the services of 23andMe of Mountain
View, California, a DNA-testing company
that surveys each customer's genome for 1 mil- 
lion of the DNA markers known as SNPs to 
trace ancestry and to predict disease risks. 
Worthington, too, bought a 23andMe kit 
and says that the results allayed his anxieties 
about having his full genome sequenced after 
they revealed no susceptibility to serious ill- 
nesses. He also notes the naivety of his ration- 
ale, saying: “It is easily possible that next week 
a genome-wide association study will report 
that, based upon my 23andMe genotypes alone, 


I am at substantial risk for some as yet under- 
appreciated terrible disease.” 

But for other respondents, particularly 
those in medicine or public health, probing 
disease risk was a primary motivation. Kelly 
Leight, who coordinates the group Preserv- 
ing the Future of Newborn Screening, based 
in Short Hills, New Jersey, became an advocate 
of personal genomics after her daughter was 
diagnosed with late-onset congenital adrenal 
hyperplasia, a genetic disorder that can be seri- 
ous if untreated. Leight, her husband and her 
daughter were all sequenced for the causal gene 
to confirm the diagnosis. Later, she learned 
about 23andMe. “When I found out about 
direct-to-consumer genetic testing, I thought, 
‘This is totally for me. I’ve got to do this.’”

Like many poll respondents, she says her 
genome is now more of a hobby than anything 
else. Health-risk information from a genome 
scan did persuade her to lose 60 pounds in 
weight, putting her among the 27% of those in 
the poll who changed their behaviour because 
of information in their genome. She's spread the 
word to her family, buying genotyping kits for 
family members. She says that her geneticist 
and genetic-counsellor friends are shocked that 
she would push such a personal decision on her 
family. But Leight sees no harm. Some family 
members embraced their genomes, others 
ignored them. But “I didn’t put any pressure on
them”, Leight says. “Well, maybe I did.”

Experts question
rankings of journals 


F1000 scoring system could throw off results, say critics. 


BY DECLAN BUTLER 


Peer review may be a good way to assess
papers, but it can fall short in
ranking the journals themselves. That’s 
the reaction of some metrics experts to the first 
such journal rankings, launched this week by 
the Faculty of 1000 (F1000) in London. Critics 
question the method, which relies on scores 
awarded to individual papers by the F1000 
‘faculty’ of 10,000 scientists and clinicians. Such 
scores, they claim, could be skewed by the inter- 
ests and enthusiasms of individual reviewers. 
Richard Grant, associate editor of F1000, 
says that the rankings give authors a valuable 
measure, complementary to journal league 
tables based on citation impact. He says 
that the first F1000 rankings will be refined, 
adding that F1000 is “constantly striving to 


improve coverage in all specialities”. 

Created in 2002, the F1000 aims to filter the 
literature by asking experts to select noteworthy 
papers and rate them as 6 (recommended), 8 
(must read) or 10 (exceptional). Now, it has 
extended the concept by totting up the scores of 
all a journal's rated articles over a given period,
and normalizing the totals — adjusting for the 
total number of articles that the journal pub- 
lished over that period, for example. 
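
The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give F1000's actual normalization, so dividing a journal's summed ratings by its article count is assumed here, and the journal names and numbers are hypothetical.

```python
from collections import defaultdict

# Ratings described in the article: 6 (recommended), 8 (must read), 10 (exceptional).
VALID_SCORES = {6, 8, 10}


def journal_scores(evaluations, articles_published):
    """Sum each journal's ratings, then divide by its article count.

    evaluations: iterable of (journal, score) pairs.
    articles_published: dict mapping journal -> articles published in the period.
    Dividing by article count is an assumed normalization, not F1000's formula.
    """
    totals = defaultdict(int)
    for journal, score in evaluations:
        if score not in VALID_SCORES:
            raise ValueError(f"unexpected score {score!r} for {journal!r}")
        totals[journal] += score
    return {
        journal: totals[journal] / count
        for journal, count in articles_published.items()
        if count > 0
    }


def rank(scores):
    """Return journal names ordered from highest to lowest normalized score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: a large journal with many rated papers, a small one with two.
evaluations = [("Large Journal", 8)] * 50 + [("Small Journal", 10)] * 2
published = {"Large Journal": 1000, "Small Journal": 20}
scores = journal_scores(evaluations, published)
print(rank(scores))
```

With these made-up inputs, two top ratings are enough to lift the small journal above the large one, because its denominator is so small. Any per-article normalization of this kind is sensitive to a handful of evaluations.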

The results put the usual suspects at the 
top. In the rankings for 2010, the latest full 
year available, Nature leads in biology and the 
New England Journal of Medicine in medicine. 
But further down the lists, the F1000 often 
departs from impact factors. “We're aware the 
correlation with impact factors isn't exact, and 
we wouldn't expect it to be,” says Grant. The 
Proceedings of the National Academy of Sciences 


(PNAS) “does particularly well by our ranking 
because there are a lot of papers in there that 
are obviously valuable to the community”. 

But some critics say that the limited num- 
ber of papers reviewed — fewer than 20,000 
per year, of more than one million published 
— could compromise the rankings. “The 
scores may tell us as much about the composi- 
tion of the F1000 faculty as they do about the 
relative quality of various journals,” says Carl
Bergstrom, a biologist at the University of Wash- 
ington, Seattle, and an F1000 faculty member 
who publishes a rival metric, the Eigenfactor. 

Philip Davis, a scholarly-publishing expert 
at Cornell University in Ithaca, New York, 
says that “a single enthusiastic reviewer could 
propel a small, specialist journal into a high 
ranking simply by submitting more reviews”. 
One journal seems to owe its surprisingly high 
ranking to a series of very positive evaluations 
of its articles by its own editor. 

Grant says that such a competing interest 
should have been declared, and that the F1000 
will look into the matter. In the interim, the 
journal in question has been withdrawn from 
the rankings. The F1000 will also make its code 
of conduct more explicit, says Grant. He notes 
that all evaluations and methodology are avail- 
able on the F1000’s website, making the ranking 
process transparent and allowing users to alert 
the F1000 to any concerns.

The first clinical uses of whole-genome sequencing
show just how challenging it can be. 


BY BRENDAN MAHER 


The first thing Debbie Jorde noticed about her newborn daughter was that her arms were
bent at unnatural angles. She had other problems, too: a cleft palate, eight fingers, eight
toes and no lower eyelids. She would eventually be diagnosed with Miller syndrome, a
disease so rare that doctors have long assumed that each case arises through spontaneous mutation,
rather than being passed down through families. Doctors told Jorde that her chances of
having a second child with the syndrome were less than one in a million.

They were wrong. Jorde’s son, born three years after his sister, had the same features. Lynn Jorde,
Debbie’s current husband and a geneticist at the University of Utah in Salt Lake City, still cringes
when Debbie recounts what the doctors had told her. “The right answer for that situation is that
there have been so few cases that we really can’t predict the risk,” he says.

Thanks to next-generation genome sequencing, Debbie and her children now know the family’s
genetic risks. Lynn and his collaborators had been talking about sequencing the genomes of
an entire ‘nuclear’ family affected by a genetic disease, both to identify the mutation responsible
and to investigate how genes are inherited in unprecedented detail. Debbie, her former husband
and her now-adult children, Heather and Logan Madsen, were happy to take part, and in 2009
became the first family in the world to have their genomes fully sequenced.

Over the course of six months, the research team cross-compared the whopping amount of
DNA data from the four genomes. With the help of a parallel sequencing effort that included others
with Miller syndrome, the researchers identified the gene involved, called
DHODH, which encodes a protein involved in the synthesis of nucleo- 
tides. The disease, it turns out, is recessive. In this case, both parents car- 
ried a single mutated copy of the gene, so their chance of having a child 
with the syndrome was actually one in four. The analyses also revealed 
that the children had a second recessive genetic disorder, primary ciliary 
dyskinesia, which affects lung development. Before that discovery, says 
Debbie, “We never knew why they kept getting pneumonia.” 
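
The one-in-four figure follows from simple Mendelian arithmetic when both parents carry one mutated copy of a recessive gene. The short Python sketch below enumerates the possible combinations; the allele labels "A" (working copy) and "a" (mutated copy) are illustrative conventions, not notation from the study.

```python
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return the probability of each child genotype for two parental genotypes.

    Each parent is a two-letter string such as "Aa"; a child inherits one
    allele from each parent, and every pairing is assumed equally likely.
    """
    counts = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))  # "aA" and "Aa" are the same
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Both parents are carriers: one working copy ("A") and one mutated copy ("a").
probabilities = offspring_genotypes("Aa", "Aa")
affected = probabilities["aa"]  # two mutated copies, so the recessive disease appears
print(f"Chance that a child is affected: {affected}")
```

The same enumeration gives a one-in-two chance that a child is an unaffected carrier, which is why a recessive condition can pass unnoticed through a family.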

Families like Debbie Jorde’s are part of a small but growing vanguard 
of people, mostly with rare diseases and cancers, whose genomes have 
been sequenced to help diagnose or understand their condition. Although 
knowing the sequence didn't alter treatments for Heather and Logan, 
some individuals are being sequenced with that intent. A boy in Wiscon- 
sin was given a risky but life-saving bone-marrow transplant last year 
on the basis of a partial genome sequence; a woman with leukaemia
was spared a similar procedure after her genome was sequenced; and
genome sequencing was used to refine the therapy given to twins with a
rare disorder (see ‘6 billion to one’).

Most of those involved so far have been lucky enough to know the right 
people — researchers with an interest in clinical genetics — or deter- 
mined enough to seek them out, and many, such as Debbie Jorde’s family, 
were taking part in research projects. But now, with genome sequenc- 
ing becoming much cheaper and faster, clinical programmes are start- 
ing up around the world that will routinely analyse genomes for those 
who might benefit from the information. Illumina, which is based in San 
Diego, California, and provided the sequencing machines for many of the 
programmes, offers whole-genome sequencing for as little as US$7,500 
for people with life-threatening disease, and for $10,000 for people with 
cancers that require the sequencing of both tumour and non-cancer cells. 

As prices fall further, some say that prescribing a genome sequence 
or analysis will become akin to requesting a magnetic resonance imag- 
ing (MRI) scan. “It’s just like any other test in medicine. There's nothing 
remotely special about it,” says David Bick, a clinical geneticist at the
Medical College of Wisconsin in Milwaukee. But, he adds, “people will 
cry and scream and yell about that statement”. That's true: unlike the 
results of most medical tests, a genome sequence provides a vast amount 
of difficult-to-interpret data, not all of which will be necessary for diag- 
nosing or treating the patient’s condition and which could provide 
unwanted clues to future health risks. The few success stories published 
so far also suggest that wringing information from the human genome 
and counselling patients and their families adequately may be too big 
a burden for medical systems that are already stretched to their limits. 
“You can’t immediately jump from those few profound but limited stories
and think that you can reduce this to practice for clinical care,” says
Eric Green, director of the National Human Genome Research Institute 
(NHGRI) in Bethesda, Maryland. Still, from the pioneering cases, much 
can be learned. 


RARE BIRTHS 
Take Nicholas Volker. From the time he was born, an undiagnosed condi- 
tion ravaged his intestines, sometimes causing fistulae: holes that ran from 
his gut through to the outside of his body, leaking faeces and requiring 
surgery. By the time he turned three, Volker had been in an operating 
room more than 100 times. Doctors hypothesized that he had an immune 
deficiency and that a bone-marrow transplant might correct the problem. 
But a number of tests, including the sequencing of several genes, were 
inconclusive. After intense deliberation, a team at the Medical College 
of Wisconsin was cleared to sequence Volker’s exome, the 1-2% of the 
genome that codes for proteins and key regulatory RNA molecules. 
Using computational tools, the team combed Volker’s DNA for 
sequences that vary from person to person. They compared these with 
known variants in the general population, with variants associated with 
diseases and with related sequences in other species, looking for a muta- 
tion that might have caused the problem, says David Dimmock, a clinical 
geneticist at the college. It took “basically one person staring at a computer
for three and a half months”, he says, but eventually they identified
a mutation on the X chromosome in a gene called X-linked inhibitor of
apoptosis, or XIAP (ref. 3). A deficiency of the protein encoded by this
gene is known to put patients at high risk for a deadly immune-cell dis- 
order, and a bone-marrow transplant suddenly became imperative. More 
than a year later, Dimmock says, Volker is doing well. 

What started as an experiment has become a programme at Wisconsin, 
where Dimmock, Bick and their colleagues now aim to provide compre- 
hensive whole-genome sequencing for patients. The team is focusing on 
people with rare disorders that are thought to involve a genetic defect, and 
in whom identifying that defect is likely to inform the course of treatment. 

Bick says that of 48 patients evaluated for the programme, 17 have 
been accepted, and their families have gone through six hours or more of 
genetic counselling before sequencing. Insurance companies have agreed 
to foot the bill for at least two of the cases. Their rationale is straightforward,
says Tina Hambuch, a senior scientist at Illumina’s clinical
services laboratory, which has been doing the sequencing for this pro- 
gramme. A full genome sequence can be less expensive than a series of sin- 
gle genetic tests, and might clarify whether a costly treatment is required. 
“There are cases where it’s cost effective,” Hambuch says.


GENOME FACTORIES 
Other institutions are following suit. In the United Kingdom, the 
Wellcome Trust Centre for Human Genetics at the University of Oxford 
has made plans with Illumina to sequence 500 genomes from people 
— some from the same family — with a wide range of conditions. The 
Undiagnosed Diseases Program at the National Institutes of Health in 
Bethesda has been running a sequencing programme since 2008. It has 
sequenced more than 140 exomes and 5 genomes in its attempts to find 
the molecular underpinnings of diseases that have eluded diagnosis. The 
programme was so overwhelmed by interest that it temporarily stopped 
accepting applications a few months ago. 

Green says that “now is the time to push the accelerator”. Clinical 
geneticists often talk about tackling Mendelian disorders: diseases 
thought to involve a single gene and that roughly obey the rules of inheritance
drawn up by Gregor Mendel in the nineteenth century. These
conditions may account for as many as 20% of paediatric hospitalizations 
worldwide and a large share of health-care costs. Yet their genetic basis 
is often unknown. The compendium of such conditions, called Online 
Mendelian Inheritance in Man (OMIM), currently contains just under 
7,000 disorders, about half of which have been assigned a molecular cause. 
This autumn, Green says, the NHGRI will announce the winners of its 
Mendelian Disorders Genome Centers grants, which will fund sequenc- 
ing centres looking for causes of the rest. 

Still, many researchers worry that it will be difficult to make clinical 
use of most genomes. At the Undiagnosed Diseases Program, the misses 
have certainly outnumbered the hits so far. “I think we've learned a lot 
about how hard evaluating an exome is,” says Thomas Markello, from the 
medical-genetics branch of the NHGRI. “I’m most concerned that people 
don't recognize that what's been published to date are the success stories.” 

Many researchers say that genome sequencing could be used in 
diagnosis and therapy of cancer more easily than in rare diseases. Clini- 
cians are already doing sophisticated analyses of some tumours in order to 
tailor therapies to the patient’s genetic characteristics; a genome sequence 
provides even more molecular detail. For example, an individual's cancer 
genome sometimes reveals defects in a pathway that might point to use of 
a known drug, but were not apparent from standard tests. 

In 2007, a 78-year-old man in Canada with a rare tongue cancer that 
had spread throughout his body was being treated at the British Columbia 
Cancer Agency in Vancouver. There was no approved treatment for his
type of cancer and — being what doctors described as a “savvy sort” — 
he and his clinician convinced scientists at the agency to sequence the 
cancer’s genome. The scientists also analysed its transcriptome, revealing
both the sequence and the amount of RNA that the tumour was produc- 
ing. The team then compared these data with those for other cancers and 
for the patient's normal cells. 

The researchers homed in on RET, a gene known to promote cancer, 
which was duplicated in the tumour genome and churning out RNA. Sev- 
eral drugs are known to inhibit the protein encoded by this gene. Marco 
Marra, director of the cancer agency's genome-sciences centre, says that 
“after much agonizing and hand-wringing”, the clinical team prioritized
these drugs and tried the top one, sunitinib. The cancer stabilized for 
several months on this and a second treatment, but eventually started to 
spread again. An analysis of the recurring tumours showed that different 
cancer-promoting pathways had been activated, making the tumours
resistant to the first drug, but possibly responsive to others. Unfortunately, 
by then it was too late to do more: the man died. 


UNSTOPPABLE TRAIN 
Marra’s group is now setting up a project to better diagnose subtypes 
of another cancer, acute myelogenous leukaemia, using transcriptome 
and other sequencing methods. Partly inspired by Marra’s efforts, Elaine
Mardis, a geneticist at Washington University in St Louis, Missouri, and 
her collaborators have used genome sequencing to try to help a handful 
of people with cancer, including the woman with leukaemia. The woman
had been treated and had gone into remission, but standard tests were
unable to show conclusively whether she had acute promyelocytic leukaemia
(APL) — which generally has a good outcome with standard therapy — or
a type of leukaemia that would require aggressive follow-up treatment,
such as a bone-marrow transplantation. Over about seven weeks, the team
sequenced the cancer’s genome and found a gene fusion that was consist- 
ent with APL. Mardis is enthusiastic about the approach, but notes its 
limitations. “It’s another piece of evidence,” she says. “It’s not going to be
the only thing that you're looking at when going to a patient diagnosis.” 
Moving whole-genome sequencing from research to clinic is beset 
with challenges. Unlike in research, DNA sequencing that is intended to 
inform a diagnosis must be done in accredited laboratories, such as those 
used by Illumina. The institutional review boards
that oversee research in humans have not reached a
consensus on whether approval is needed for clinical
genome sequencing; and the US Food and Drug
Administration is yet to work out how to regulate
the coming wave of clinical sequencing.

[Figure: successive filters applied to the twins’ genomes: single-base variants shared by twins
that differ from reference human genome; variants that code for proteins; variants that change
amino-acid sequence; rare variants (which are more likely to cause disease); candidate genes.]


Many researchers and clinicians worry that health systems don’t have 
enough people well versed in genomics or bioinformatics to interpret 
the flood of data. What's more, say experts, function and disease infor- 
mation for the human genome is scattered across scientific articles 
and databases that are hard to troll through and aren't always correct. 
Sequence analysis is where most costs now lie. Hambuch says that for 
the few research projects on which Illumina has collaborated, just iden- 
tifying all the variants in a genome has taken two to three weeks. “That's 
a lot of effort from high-skill people,” she says.

The information could also overwhelm patients. Medical geneticists 
and ethicists have long worried about finding genetic pointers to disease 
risks that are unrelated to the illness being treated. With a full genome 
sequence, the likelihood of such incidental findings shoots up. The situ- 
ation is particularly tricky for young patients. Do parents have the right 
to decide for them what information is revealed? This is where many of 
those hours of genetic counselling are spent, says Bick. 

For these reasons, Stephen Kingsmore at Children’s Mercy Hospital in 
Kansas City, Missouri, argues that clinical sequencing should be limited 
in scope. He advocates sequencing just what he calls the Mendelianome, 
the genetic regions known to be involved in inherited diseases. “Ethically, 
legally, socially that’s going to be more acceptable,” he says. His group is 
developing methods that use a panel of mutations associated with just 
over 600 recessive diseases for such screening. Doing much more than 
this, he says, puts research goals ahead of the patient. 

But some geneticists think that the train is unstoppable. “Once you 
demonstrate how informative this technology is, I think this is going to be
widely adopted,” says Hakon Hakonarson, who is starting a programme
for clinical assessment of genomes at the Children’s Hospital of Philadel- 
phia in Pennsylvania. 

The members of Debbie Jorde’s family still ponder what their genome 
sequences have meant for them. Although the sequences didn’t alter treatment,
if they had known about the lung problem earlier it might have
prevented a dangerous procedure that both Heather and Logan under- 
went to reduce the recurrence of pneumonia. 

Still, Lynn Jorde thinks that more successes are on the way for genomes 
in clinical care. “I’d predict some spectacular applications.” But, he adds,
“I’m a congenital optimist.”


Brendan Maher is a features editor for Nature. 

A surge in withdrawn papers is highlighting
weaknesses in the system for handling them. 


This week, some 27,000 freshly published
research articles will pour into the Web 
of Science, Thomson Reuters’ vast 
online database of scientific publica- 
tions. Almost all of these papers will stay 
there forever, a fixed contribution to the 
research literature. But 200 or so will eventu- 
ally be flagged with a note of alteration such as 
a correction. And a handful — maybe five or 
six — will one day receive science's ultimate 
post-publication punishment: retraction, the 
official declaration that a paper is so flawed 
that it must be withdrawn from the literature. 

It is reassuring that retractions are so rare, 
for behind at least half of them lies some shock- 
ing tale of scientific misconduct — plagiarism, 
altered images or faked data — and the other 
half are admissions of embarrassing mistakes. 
But retraction notices are increasing rapidly. 
In the early 2000s, only about 30 retraction 
notices appeared annually. This year, the Web 
of Science is on track to index more than 400 
(see ‘Rise of the retractions’) — even though 
the total number of papers published has risen 
by only 44% over the past decade. 
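
The figures quoted so far can be checked with a little arithmetic. The Python sketch below uses only numbers from the text, with two assumptions made for illustration: "five or six" retractions is taken as 5.5, and the 30 and 400 annual notices are treated as the two ends of the decade over which the literature grew by 44%.

```python
def fold_change(before, after):
    """Return how many times larger `after` is than `before`."""
    return after / before


def percent_of(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Annual retraction notices: about 30 in the early 2000s, more than 400 this year.
notices_fold = fold_change(30, 400)

# The literature grew by 44% over the decade, so divide out that growth
# to estimate how much the retraction rate per paper has risen.
literature_fold = 1.44
rate_fold = notices_fold / literature_fold

# Share of one week's 27,000 papers that will eventually be retracted,
# taking "five or six" as 5.5 (an assumed midpoint).
weekly_share = percent_of(5.5, 27_000)

print(f"Notices rose {notices_fold:.1f}-fold; rate per paper rose {rate_fold:.1f}-fold")
print(f"About {weekly_share:.3f}% of a week's papers are eventually retracted")
```

On these assumptions, growth in the literature accounts for only a small part of the rise, and the share of papers that end up retracted stays at a few hundredths of a per cent.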

Perhaps surprisingly, scientists and editors 
broadly welcome the trend. “I don't think there's 
any doubt that we're detecting more fraud, and 
that systems are more responsive to misconduct. 
It’s become more acceptable for journals to step 
in,” says Nicholas Steneck, a research ethicist at 
the University of Michigan in Ann Arbor. But 
as retractions become more commonplace, 
stresses that have always existed in the system
are starting to show more vividly. 

When the UK-based Committee on Pub- 
lication Ethics (COPE) surveyed editors’ 
attitudes to retraction two years ago, it found 
huge inconsistencies in policies and practices 
between journals, says Elizabeth Wager, a 
medical writer in Princes Risborough, UK, 
who is chair of COPE. That survey led to 
retraction guidelines that COPE published in 
2009. But it’s still the case, says Wager, that 

“editors often have to be pushed to retract”. 

Other frustrations include opaque retrac- 
tion notices that don't explain why a paper has 
been withdrawn, a tendency for authors to 
keep citing retracted papers long after they've 
been red-flagged (see ‘Withdrawn papers live
on’) and the fact that many scientists hear
‘retraction’ and immediately think ‘misconduct’,
a stigma that may keep researchers
from coming forward to admit honest errors. 

Perfection may be too much to expect from 
any system that has to deal with human error 
in all its messiness. As one journal editor told 
Wager, each retraction is “painfully unique”.

But as more retractions hit the headlines, 
some researchers are calling for ways to 
improve their handling. Suggested reforms 
include better systems for linking papers to 
their retraction notices or revisions, more 
responsibility on the part of journal editors 
and, most of all, greater transparency and 
clarity about mistakes in research. 

The reasons behind the rise in retractions
are still unclear. “I don’t think that there is sud- 
denly a boom in the production of fraudulent 
or erroneous work,” says John Ioannidis, a
professor of health policy at Stanford Univer- 
sity School of Medicine in California, who has 
spent much of his career tracking how medical 
science produces flawed results. 

In surveys, around 1-2% of scientists admit 
to having fabricated, falsified or modified data 
or results at least once (D. Fanelli PLoS ONE 4, 
e5738; 2009). But over the past decade, retrac- 
tion notices for published papers have increased 
from 0.001% of the total to only about 0.02%. 
And, Ioannidis says, that subset of papers is “the 
tip of the iceberg” — too small and fragmentary 
for any useful conclusions to be drawn about 
the overall rates of sloppiness or misconduct. 

Instead, it is more probable that the growth 
in retractions has come from an increased 
awareness of research misconduct, says 
Steneck. That's thanks in part to the setting 
up of regulatory bodies such as the US Office 
of Research Integrity in the Department of 
Health and Human Services. These ensure 
greater accountability for the research insti- 
tutions, which, along with researchers, are 
responsible for detecting mistakes. 

The growth also owes a lot to the emergence 
of software for easily detecting plagiarism 
and image manipulation, combined with the 
greater number of readers that the Internet 
brings to research papers. In the future, wider 
use of such software could cause the rate of 
retraction notices to dip as fast as it spiked, 
simply because more of the problematic 
papers will be screened out before they reach 
publication. On the other hand, editors’ 
newfound comfort with talking about retrac- 
tion may lead to notices coming at an even 
greater rate. 

“Norms are changing all the time,” says 
Steven Shafer, editor-in-chief of the journal 
Anesthesia & Analgesia, who has participated 
in two major misconduct investigations — 
one of which involved 11 journals and led to 
the retraction of some 90 papers. 


But willingness to talk about retractions is 
hardly universal. “There are a lot of publish- 
ers and a lot of journal editors who really 
don’t want people to know about what’s 
going on at their publications,” says New
York City-based writer Ivan Oransky, execu- 
tive editor at Reuters Health. In August 2010, 
Oransky co-founded the blog Retraction 
Watch with Adam Marcus, managing edi- 
tor at Anesthesiology News. Since its launch, 
Oransky says, the site has logged 1.1 mil- 
lion page views and has covered more than 
200 retractions. 

In one memorable post, the reporters 
describe ringing up one editor, L. Henry 
Edmunds at the Annals of Thoracic Surgery, 
to ask about a paper withdrawn from his 


journal (go.nature.com/ubv261). “It’s none of 
your damn business!” he told them. Edmunds 
did not respond to Nature’s request to talk for 
this article. 

The posts on Retraction Watch show how 
wildly inconsistent retraction practices are
from one journal to the next. Notices range 
from informative and transparent to deeply 
obscure. A typically unhelpful example of the 
genre would be: “This article has been with- 
drawn at the request of the authors in order 
to eliminate incorrect information.” Oransky 
argues that such obscurity leads readers to 
assume misconduct, as scientists making an 
honest retraction would, presumably, try to 
explain what was at fault. 




To Drummond Rennie, deputy editor of 
the Journal of the American Medical Associa- 
tion, there are two obvious reasons for obscure 
retraction notices: “fear and work”.

The fear factor, says Wager, is because pub- 
lishers are very frightened of being sued. “They 
are incredibly twitchy about publishing anything
that could be defamatory,” she says.

‘Work’ refers to the phenomenal effort
required to sort through authorship disputes, 
concerns about human or animal subjects, 
accusations of data fabrication and all the other 
ways a paper can go wrong. “It takes dozens or 
hundreds of hours of work to get to the bot- 
tom of what's going on and really understand 
it,” says Shafer. Because most journal editors
are scientists or physicians working on a voluntary
basis, he says, that effort comes out of
their research and clinical time.

[Figure caption: In the past decade, the number of retraction notices has shot up 10-fold (top), even as the literature
has expanded by only 44%. It is likely that only about half of all retractions are for researcher
misconduct (middle). Higher-impact journals have logged more retraction notices over the past decade,
but much of the increase during 2006-10 came from lower-impact journals (bottom).]

But the effort has to be made, says Steneck. 
“If you don't have enough time to do a 
reasonable job of ensuring the integrity of your 
journal, do you deserve to be in business as a 
journal publisher?” he asks. Oransky and Mar- 
cus have taken a similar stance. This summer, 
for example, Retraction Watch criticized the 
Journal of Neuroscience for a pair of identi- 
cal retraction notices it published on 8 June: 
“At the request of the authors, the following 
manuscript has been retracted.”

But the journal’s editor-in-chief, 
neuroscientist John Maunsell of Harvard Med- 
ical School in Boston, Massachusetts, argues 
that such obscurity is often the most respon- 
sible course to take. “My feeling is that there 
are far fewer retractions than there should be,” 
says Maunsell, who adds that he has conducted 
79 ethics investigations in more than 3 years at 
the journal — 1 every 2-3 weeks. But “authors 
are reluctant to retract papers”, he says, “and
anything we put up in the way of a barrier or
disincentive is a bad thing. If authors are happier
posting retractions without extra information,
I'd rather see that retraction go through
than provide any discouragement.”

At the heart of these arguments, says 
Steneck, lie shifting norms of how responsible 
journal editors should be for the integrity of 
the research process. In the past, he says, “they 
felt that institutions and scientists ought to do 
it”. More and more journal editors today are 
starting to embrace the gatekeeper role. But 
even now, Shafer points out, they have only 
limited authority to challenge institutions that 
are refusing to cooperate. “I have had institu- 
tions, where I felt there was very clear miscon- 
duct, come back and tell me there was none,” 
Shafer says. “And I have had a US institution 
tell me that they would look into allegations of 
misconduct only if I agreed to keep the results
confidential.” 


Discussions on Retraction Watch make it 
clear that many scientists would like to sepa- 
rate two aspects of retraction that seem to 
have become tangled together: cleaning up the 
literature, and signalling misconduct. After 
all, many retractions are straightforward and 
honourable. In July, for example, Derek Stein, a 
physicist at Brown University in Providence, 
Rhode Island, retracted a paper in Physical 
Review Letters on DNA in nanofluidic chan- 
nels when he found that a key part of the 
analysis had been performed incorrectly. His 
thoroughness and speed — the retraction came 
just four months after publication — were 
singled out for praise on Retraction Watch. 
But because almost all of the retractions that 
hit the headlines are dramatic examples of mis- 
conduct, many researchers assume that any 
retraction indicates that something shady has
occurred. And that stigma may dissuade honest
scientists from doing the right thing. One
American researcher who talked to Nature
about his own early-career retraction said he
hoped that his decision would be seen as a badge
of honour. But, even years later and with his
career established, he still did not want Nature
to use his name or give any details of the case.

There is no general agreement about
how to reduce this stigma. Rennie suggests
reserving the retraction mechanism exclusively
for misconduct, but that would require the
creation of a new term for withdrawals owing to
honest mistakes. At the other extreme, Thomas
DeCoursey, a biologist at Rush University
Medical Center in Chicago, argues for
retraction of any paper that publishes results
that are not reproducible. “It does not matter
whether the error was due to outright fraud,
honest mistakes or reasons that simply cannot
be determined,” he says.

A better vocabulary for talking about
retractions is needed, says Steneck — one
acknowledging that retractions are just as
often due to mistakes as to misconduct. Also
useful would be a database for classifying
retractions. “The risk for the research community
is that if it doesn’t take these problems
more seriously, then the public — journalists,
outsiders — will come in and start to poke at
them,” he points out.

The only near-term solution comes back
to transparency. “If journals told readers
why a paper was retracted, it wouldn't matter
if one journal retracted papers for misconduct
while another retracted for almost anything,”
says Zen Faulkes, a biologist at the University
of Texas–Pan American in Edinburg, Texas.

Oransky agrees. “I think that what we're
advocating is part of a much larger phenomenon
in public life and on the Web right now,”
he says. “What scientists should be doing is
saying, ‘In the course of what we do are errors,
and among us are also people that commit
misconduct or fraud. Look how small that
number is! And here's what we're doing to root
that out.’”


Withdrawn papers live on

In theory, retracting a paper is tantamount
to withdrawing it from the scientific
literature, so that it will never again mislead
anyone. But when John Budd, at the School
of Education at the University of Missouri in
Columbia, examined 235 articles retracted
during 1966-96, he found that they were
cited in total more than 2,000 times after
their withdrawal, with fewer than 8% of the
citations acknowledging the retraction. And
the rates haven’t improved much in the age
of electronic publication: in a preliminary
analysis of 1,112 retracted papers during
1997-2009, Budd finds them cited just as
often, with the retraction mentioned in only
about 4% of the citations. Other studies
suggest that the situation is even worse for
corrections, which are more numerous and
often add important updates to a paper.
One solution is being developed by
CrossRef, a non-profit collaboration of 3,599
commercial and learned-society publishers.
It tries to address the fact that many
researchers today never see corrections
or retraction notices because they just
download digital, PDF-formatted copies
of the papers they need, and never again
consult the original source. A new system,
called CrossMark, consists of a logo that
publishers will put on every PDF. Clicking
on the logo will show Internet-connected
users any updates to the work, whether
retractions, corrections or other notes. The
project is expected to launch in early 2012.
That will help researchers become aware
of updates that have been recorded. But
most science doesn’t progress by revising
its written record. Papers superseded by
later work, or that are controversial for
some reason, are usually never flagged; the
status quo remains that researchers are
left to learn about them by soaking up the
lore in their particular community. “There
is nothing more irritating than publishing
a paper that completely disproves every
major conclusion of a study, and then
years later seeing reviews or other papers
cite the original (wrong) study, without the
authors being aware that any doubts were
ever raised,” says Thomas DeCoursey, a
biologist at Rush University Medical Center
in Chicago. In 2006, he raised questions
about research that had been published in
Nature two years earlier; the paper was not
retracted until November 2010.

Ivan Oransky, executive editor at
Reuters Health and co-founder of the blog
Retraction Watch, feels that such difficulties
are just symptoms of a wider issue with
the reward system of academic research:
publications are the only way to accrue
scientific merit, so they take on a sanctity
that academics are reluctant to disrupt with
corrections or retractions. If researchers
could afford to view scientific output more
as a continuous stream, rather than a
punctuated series of publications, revisions
would carry less of a stigma, he says.


Phosphate is mined to produce fertilizers for crops, but phosphorus leaching into water supplies is an environmental hazard.


A broken biogeochemical cycle


Excess phosphorus is polluting our environment while, ironically, 
mineable resources of this essential nutrient are limited. James Elser and 
Elena Bennett argue that recycling programmes are urgently needed. 


To meet our demands for energy,
humankind has moved masses of
carbon from deep underground into
the atmosphere, wreaking havoc with the
climate. To meet our demand for food, we
have moved large amounts of nitrogen from
the atmosphere to fields, rivers and forests,
devastating ecosystems. To grow our crops
we have interfered with Earth's reserves of
a third element — phosphorus — which
receives much less press and for which we
face the unique problem of having both too
much and too little.

Since the middle of the twentieth century,
humanity has quadrupled the environmental
flow of phosphorus, an essential element
for all forms of life. We dug up geological
phosphate reserves to produce fertilizers
to feed the Green Revolution, creating a
largely one-way flow of phosphorus from
rocks to farms to lakes and oceans, and dramatically
impairing freshwater and coastal
marine ecosystems. Globally, oxygen-depleted
marine coastal ‘dead zones’ caused
by nutrient-stimulated algal blooms continue
to expand. The Gulf of Mexico's dead
zone, averaging more than 17,000 square
kilometres in recent years, was forecast to
reach record dimensions this year before a
tropical storm stirred the waters.

At the same time, concern is growing 
about how long we can count on cheap 
supplies of phosphorus for fertilizer: easily 
mineable deposits of phosphate rock are 
limited. Unlike nitrogen, phosphorus can- 
not be pulled from the air and, unlike the 
carbon in our energy system, there is no 
known replacement. In 2009, Dana Cordell 
of the University of Technology in Sydney, 
Australia, and her colleagues published a 
‘peak phosphorus’ forecast that predicted
maximum production around 2030 — an
alarmingly imminent forecast in the light
of widespread riots in 2008 sparked by food 
prices, and a 700% increase in phosphate 
rock prices from 2007 to 2008. 

These issues are not entirely new. In 1938, 
US President Franklin Roosevelt said it was 
“high time for the Nation to adopt a national 
policy for the production and conservation 
of phosphates for the benefit of this and 
coming generations”. Astonishingly, such a
comprehensive policy never emerged, 
although in the 1970s, the Tennessee Val- 
ley Authority set up the National Fertilizer 
Development Center to study and expand the 
production and use of phosphate fertilizers. 
This was the forerunner of the International 
Fertilizer Development Center (IFDC), 
headquartered in Muscle Shoals, Alabama. 

New research initiatives are emerging to 
tackle the two faces of phosphorus, including 
the Sustainable Phosphorus Initiative at Ari- 
zona State University in Tempe, of which one 
of us (J. E.) is a co-founder. The world continues
to face deteriorating water quality, uncertainty
about future supplies of phosphorus
and uncoordinated institutional frameworks. 
So we need to move quickly beyond academic 
discussions to creative policy solutions. 


POWER IMBALANCE 

Estimates of how much readily accessible 
phosphorus is left have increased since the 
wave of concern in 2009. But uncertainties 
surrounding phosphorus supply remain 
unreasonably large. Last year, economic geol- 
ogist Steven Van Kauwenbergh at the IFDC 
produced an assessment of global phosphorus 
reserves, which, by incorporating previously
overlooked geological reports from the 1980s,
greatly increased the estimated reserves for
Morocco and its disputed territory of Western
Sahara. This led the US Geological Survey
to increase its estimate of accessible global
phosphate rock reserves by more than fourfold,
from around 15 billion tonnes to around
65 billion tonnes. It is disturbing that these
numbers can change by so much so quickly.

[Figure: ‘Global imbalance’]

More important than the amount of phosphorus
in the ground is how much it will cost
to get it out. Overall, three countries control
more than 85% of the known global phosphorus
reserves, with Morocco clearly in the
driver’s seat (see ‘Global imbalance’). This
concentration of power is far greater than for
oil, where the dozen members of the Organization
of the Petroleum Exporting Countries
control 80% of the world’s oil reserves. Such a
power imbalance is a potential source of tension,
given the political turmoil in northern
Africa and the fact that developing-world
farmers cannot afford phosphate fertilizers
even at today’s non-monopoly prices.
Major regions of the world have diminishing
(United States), few (India) or no
(northern Europe) phosphorus reserves
of their own. Many of the world’s food producers
are in danger
of becoming completely dependent on trade
with Morocco, where press reports have
emerged of Dubai-style luxury developments
being planned in anticipation of phosphorus
windfalls.

The strategic dimensions of phosphorus
are beginning to be recognized. In May
this year, a workshop sponsored by the US
Department of Energy included phosphorus
alongside dysprosium, yttrium and other
rare earth elements of crucial importance to
US national security that face potential supply
bottlenecks. Indeed, phosphorus may be
included as a ‘strategic material’ in pending
US legislation to assess and secure access to
sources of key minerals.


Morocco holds the vast majority of global supplies of phosphorus; but these estimates can change disturbingly quickly.


RECYCLE, REDUCE
The solutions to these problems lie in recapturing
and recycling phosphorus, moving it
from where there is too much to where there
is too little, and developing ways to use it
more efficiently. Many strategies are simple
and readily available, even for poor farmers
and developing economies.

Consider the fate of the approximately 
17.5 million tonnes of phosphorus mined 
in 2005, analysed in the paper by Cordell 
et al. About 14 million tonnes of this were
used in fertilizer (much of the rest went into 
cattle-feed supplements, food preserva- 
tives, and the production of detergents and 
industrial cleaning agents) but only about 
3 million tonnes made it to the fork (or chop- 
stick). The largest loss — around 8 million 
tonnes — was directly from farms through 
soil leaching and erosion. Much research and 
effort has already been expended to reduce 
such losses, including more precise timing 
and placement of fertilizer along with no- 
till cultivation, but adoption of these best 
practices needs to become more widespread. 
In 2009, the total phosphorus mined had 
increased to 23 million tonnes, but the gen- 
eral phosphorus pathways and losses have 
not changed much since the earlier analysis. 
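
The 2005 figures quoted above can be turned into rough shares with a few lines of arithmetic. The sketch below is only illustrative: the constant and function names are invented here, and the percentages are simple ratios of the rounded tonnages in the text, not results from the underlying analysis.

```python
# Figures quoted in the text, in million tonnes of phosphorus (2005).
MINED = 17.5        # total phosphorus mined
FERTILIZER = 14.0   # portion used in fertilizer
EATEN = 3.0         # portion that reached the fork (or chopstick)
FARM_LOSS = 8.0     # lost from farms through leaching and erosion


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Hypothetical labels, chosen for this illustration only.
flows = {
    "fertilizer share of mined": share(FERTILIZER, MINED),
    "eaten share of mined": share(EATEN, MINED),
    "farm loss share of fertilizer": share(FARM_LOSS, FERTILIZER),
}

for label, pct in flows.items():
    print(f"{label}: {pct}%")
```

On these rounded numbers, roughly four-fifths of mined phosphorus went into fertilizer, less than one-fifth of what was mined was eaten, and more than half of the fertilizer phosphorus was lost directly from farms.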

On average, about 30-40% of food pro- 
duced is spoiled or wasted, and this wastes 
around 1 million tonnes of phosphorus every 
year. Producing more food within or closer
to cities could reduce waste and facilitate recy- 
cling by composting and other approaches. 

We can also recycle phosphorus from 
human waste. Each person excretes about 
1.2 grams of phosphorus per day; trapping
all of this globally would produce about 
3 million tonnes per year — about 20% 
of annual worldwide phosphate fertilizer 
consumption. Currently, only 10% of phos- 
phorus from human waste is returned to 
agriculture; one method involves extracting 
‘struvite’ (magnesium ammonium phos- 
phate) at sewage-treatment plants and 
processing it into fertilizer pellets. 
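
As a back-of-envelope check on the human-waste estimate, the calculation can be sketched as follows. Two inputs are assumptions that the text does not state: a world population of roughly 7 billion, and the reuse of the 14 million tonnes of fertilizer phosphorus quoted earlier for 2005 as a stand-in for annual fertilizer consumption. The names are illustrative only.

```python
GRAMS_PER_PERSON_PER_DAY = 1.2   # from the text
WORLD_POPULATION = 7.0e9         # assumption: roughly 7 billion people
FERTILIZER_USE_MT = 14.0         # assumption: 2005 fertilizer figure, million tonnes

GRAMS_PER_MILLION_TONNES = 1e12  # 1 tonne = 1e6 g, so 1 million tonnes = 1e12 g


def annual_excreted_phosphorus_mt(grams_per_day, population):
    """Phosphorus excreted per year by the whole population, in million tonnes."""
    grams_per_year = grams_per_day * 365 * population
    return grams_per_year / GRAMS_PER_MILLION_TONNES


total_mt = annual_excreted_phosphorus_mt(GRAMS_PER_PERSON_PER_DAY, WORLD_POPULATION)
share_of_fertilizer = 100.0 * total_mt / FERTILIZER_USE_MT

print(f"Excreted phosphorus: about {total_mt:.1f} million tonnes per year")
print(f"Share of fertilizer use: about {share_of_fertilizer:.0f}%")
```

Under these assumptions the total comes out close to 3 million tonnes a year, or about one-fifth of fertilizer use, in line with the figures in the text.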

Urine-separating toilets and latrines can 
help to capture nutrients for return to the soil 
as well as improve sanitation in the devel- 
oping world. Already deployed in Europe, 
the NoMix toilet captures urine in the front 
and faeces in the back, diverting the urine 
for recycling at household, neighbourhood 
or city scale. Urine-separating latrines are 
now being installed on a relatively large scale 
in Durban, South Africa, funded by a grant 
from the Bill & Melinda Gates Foundation. 
Another low-cost solution is the Peepoo, a 
single-use, self-sanitizing, biodegradable 
bag that captures human excreta and can 
be used, or even sold, as fertilizer 2-4 weeks 
later. For the poorest of the poor, it would be 
a potentially radical transformation if their 
own ‘waste’ could become a source of income. 

According to Cordell et al., more than
7 million tonnes of phosphorus were
released into the environment annually in the
2000s through animal manure and excreta,
causing major water-quality problems. Even
if manure were collected, large livestock 
farms are now often too far away from arable 
land for transport of the heavy wastes to be 
economically viable. Combined bioenergy 
and waste-trapping technologies can help. 
For example, bioreactors developed by Bion 
Environmental Technologies in Crestone, 
Colorado, are being used at a large dairy 
operation in Pennsylvania to prevent nutrient 
run-off to Chesapeake Bay; recovered nutri- 
ents are slated for return to farms. However, as 
with manure, transport costs could limit the 
scalability of this approach. Market incentives 
might help to make struvite-recovery sys- 
tems, such as those developed for municipal 
wastewater-treatment plants, economic for 
high-density livestock operations. 

Reducing the phosphorus requirement 
for crops and livestock would make it easier 
for sources of recycled phosphorus to meet 
agricultural demand. One way to do this is to 
encourage people to switch to vegetarianism: 
producing a vegetarian’s diet requires 1 kilo- 
gram less phosphorus per year than a meat- 
eater’s. This might be a difficult social change 
to enact, however. 

Researchers have recently engineered some 
plants to increase their ability to scavenge 
nutrients, including phosphorus, from soils. 
One approach, published earlier this year, 
modified Arabidopsis, tomato, rice, alfalfa 
and cotton to overexpress proton-translo- 
cating enzymes called pyrophosphatases, 
which leads to more elaborate root systems 
and higher production of leaves, fruit and 
seeds, among other things. So far, these
approaches have been limited to the lab and
to test plots.

What about livestock? Promisingly,
the gene for bacterial phytase, an enzyme
that breaks down the phosphorus-rich
compounds called
phytates, has been introduced into a line
of Yorkshire pigs, creating the Enviropig.
These pigs produce phytase in their saliva,
which allows them to make use of the otherwise
undigestible phytates in feed grains. The
transgenic pigs produce up to 75% less phosphorus
in their manure than do non-transgenic
pigs, and do not need phosphate added
to their diet. Twelve years after development,
the technology is still working its way through
federal approval processes in the United States
and Canada.


POLICY MEASURES 

Together, these measures would help to cut 
phosphorus-containing waste, enhancing 
food security while also reducing the polluting
effects of phosphorus run-off. But the
gaping institutional vacuum for phosphorus
governance must also be plugged. Society
needs more rigorous, independently verified
estimates of pools and fluxes of this critical
element as well as reliable ways of estimating
the affordability of remaining stocks.

The Peepoo biodegradable bag captures human excreta and can be used as fertilizer.

As illustrated by the recent radical adjust- 
ment of reserve estimates, we barely know 
how much phosphorus we have. Current 
methods to gauge phosphorus reserves rely 
largely on voluntary provision of propri- 
etary data by private industry and govern- 
ment agencies and, as in the case of Morocco, 
are often based on relatively old geological 
assessments. There are other disturbing 
uncertainties. Estimates of major phosphorus
fluxes, such as the loss of phosphorus
from agricultural lands, span a three- to
fivefold range; others, such as the global 
return of phosphorus from harvested crops 
to farms, are essentially unknown. 

Attempting to bridge these gaps, new 
networks of sustainability scientists have 
come together — as yet loosely organized 
and lightly funded. Among the first was the 
Global Phosphorus Research Initiative (now 
the Global Phosphorus Network), which 
produced some of the first estimates of 
potential time frames for phosphorus scarcity.
In 2010, the Global Transdisciplinary
Processes for Sustainable Phosphorus Man- 
agement consortium emerged, to connect 
scientists, industry, business, and govern- 
ment groups at each point of the phospho- 
rus supply and use chain. The Sustainable 
Phosphorus Initiative sponsored a summit 
in February 2011 which produced the Phoe- 
nix Phosphorus Declaration: a consensus of 
more than 100 scientists, engineers, archi- 
tects, designers, farmers, entrepreneurs, 
artists and communicators on the urgency 
and opportunities associated with achieving 
phosphorus sustainability. 

Sadly, the message hasn't yet sunk in where 
it counts. The 2009 United Nations Food and 
Agriculture Report and the 2010 report of the 
US National Research Council Committee
on Twenty-First Century Systems Agriculture
breathe hardly a word about fertilizer
supplies, prices and access, instead focusing 
on the impacts of fertilizer run-off on water 
quality, and generally emphasizing the effects 
of excess nitrogen rather than phosphorus. 
More promisingly, both Sweden and Ger- 
many are implementing ambitious directives 
to recycle up to 60% of wastewater phospho- 
rus, with half of it to be returned to farms and 
the rest to pastures or forest plantations. 

To move further and faster, we call for the 
establishment of a comprehensive network 
of national and international science and 
policy research centres for nutrient sustainability,
which should also tackle nitrogen
(which has a ‘too much’ problem) and
potassium (which may have similar geopolitical
issues of ‘too little’). Such centres
should pursue research both on fundamental 
biogeochemical processes in agriculture and 
on possible policy actions, working closely 
with practitioners and policy-makers. 

One idea is to create phosphorus-emission 
markets, similar to carbon markets. Some 
jurisdictions, including 13 US states and 
some Australian states, are considering these 
now, primarily to protect water quality. For 
example, in an arrangement under discussion 
in Maryland, an advanced wastewater-treat- 
ment plant that surpasses federal guidelines 
for nutrient releases could, through a private 
broker, sell permissions to release nutrients to 
other municipalities whose facilities cannot 
currently meet the targets. 

Another, as yet unconsidered, idea might 
be the creation of national or international 
strategic phosphorus reserves, similar to the 
petroleum reserve, to stabilize commodity 
prices. 

As Roosevelt said in 1938: “I cannot over- 
emphasize the importance of Phosphorus 
not only to agriculture and soil conservation 
but also to the physical health and economic 
security of the people of the nation.” Nearly 
75 years later, it is time to find long-term 
solutions for one of life’s essential elements.


The metric system was invented in France in the early 1790s but took 200 years to become dominant.


Scaling
Andrew Robinson applauds a chronicle of metrication that balances physics with philosophy.

The metric system of measurement
was invented by French scientists in
the early 1790s and imposed on the 
populace by the leaders of the French Revo- 
lution. For about a year, the revolutionar- 
ies even attempted to introduce a decimal 
clock, with each day divided into ten hours, 
each hour into 100 minutes and each minute 
into 100 seconds. Napoleon Bonaparte later 
congratulated the scientists: “Conquests will 
come and go, but this work will endure.”
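
The decimal clock described above divides the day into 10 × 100 × 100 = 100,000 decimal seconds, against 86,400 conventional ones. A minimal sketch of the conversion follows; the function name and the choice to truncate to whole decimal seconds are assumptions made for illustration, not a reconstruction of how revolutionary clocks were built.

```python
STANDARD_SECONDS_PER_DAY = 24 * 60 * 60    # 86,400
DECIMAL_SECONDS_PER_DAY = 10 * 100 * 100   # 100,000


def to_decimal_time(hours, minutes, seconds=0):
    """Convert a conventional time of day to (decimal hours, minutes, seconds).

    The result is truncated to whole decimal seconds.
    """
    standard = hours * 3600 + minutes * 60 + seconds
    # Scale the elapsed fraction of the day onto the 100,000-second decimal day.
    decimal = standard * DECIMAL_SECONDS_PER_DAY // STANDARD_SECONDS_PER_DAY
    d_hours, remainder = divmod(decimal, 100 * 100)
    d_minutes, d_seconds = divmod(remainder, 100)
    return d_hours, d_minutes, d_seconds


print(to_decimal_time(12, 0))   # noon
print(to_decimal_time(18, 0))   # six in the evening
```

Noon falls at 5 o'clock decimal time, and six in the evening at 7 hours 50 minutes, which gives some sense of why the scheme was abandoned so quickly.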

Yet Napoleon himself refused to use metric 
units. In 1812, he ordered their official with- 
drawal. And, after falling from power in 1815, 
he attacked the system as a “stumbling block” 
to progress. Most of the French people also 
rebelled. Not until 1840 did the government 
again dare to make use of metric measures 
obligatory. 

From France, the metric system spread 
haltingly around the globe, as World in the 
Balance documents. Robert Crease, a philo- 
sopher at Stony Brook University in New 
York, relates the history of measurement from 
ancient China to current debates over defin- 
ing the kilogram by reference to physical con- 
stants. His respect for both the philosophical 
and physical aspects of measurement adds 
tension to his account. 

Crease explains how growing industr- 
ialization and mechanization promoted the 
metre and suchlike. The advantages were also 
highlighted by events such as the Great Exhi- 
bition in London in 1851, which revealed the 
incompatibility of different national measure- 
ment systems, and by cheerleading editorials 
in Nature from the 1870s. 

In 1875, representatives of 17 nations and 
empires met in Paris to sign the Conven- 
tion of the Metre, “desiring international 
uniformity and precision in standards of 
weight and measure” and establishing the 
International Bureau of Weights and Measures
(BIPM) at Sèvres. The United Kingdom
and its colonies signed only in 1884, but in 
practice continued to use imperial measures 
for nearly another century. The Soviet Union 
officially went metric in the 1920s; Japan in 
the 1950s. The BIPM introduced the Inter- 
national System of Units (SI units, from the 
French) in 1960. Not until the final decades 
of the twentieth century did the metric sys- 
tem become dominant around the world. 

Only the United States, Myanmar (Burma) 
and Liberia have yet to enact legislation to 
metricate. Even so, scientists and many of 
the public in the United States use the metric 
system daily. The third US president, Thomas 
Jefferson — Francophile, fanatical quantifier 
and admirer of the metric system — tried to 
convert his countrymen, but eventually gave 
up. “Shall we mould our citizens to the law, or 
the law to our citizens?” he wrote in an 1817 
letter to John Quincy Adams, a future president
who in 1821 reported to the US government
on the feasibility of metrication.
Adams favoured the first option; but
his report to Congress plumped for the second,
and recommended retaining the existing
system. Metrication, said Adams, would
[…] surprisingly omitted by
Crease, gave the same advice to the British
Parliament in the 1820s.

ROBERT P. CREASE
W. W. Norton: 2011. 288 pp. £20

The most original section of the book 
concerns ancient China. Writing with 
the help of Chinese metrologists, Crease 
describes how the 12-note harmonic scale 
used in ritualized music helped to define 
the measurement system used at the impe- 
rial court. The ritual scale, known as the 
lülü, was supposedly devised by China’s first
emperor, Huang Di, in the third millennium
BC. He sent a minister to the mountains to
procure bamboo of a species revered for its
regularity in length and thickness. From a
piece 3.9 cun in length — the cun was the
width of a thumb knuckle, or one-tenth of a
chi, the ‘Chinese foot’ — Huang Di made a
one-note flute, whose pitch became the lowest
in the scale, known as the huangzhong.
Eleven more bamboo flutes created the lülü.

Whatever the truth of this legend — the 
evidence suggests that the 12-note scale was 
actually introduced much later, some time 
before 400 BC — 12 pitch regulators were
made in cast metal; the lengths were speci- 
fied by regulation in chi, linking the basic 
unit of length with musical pitch. This system 
endured for more than 2,000 years and was 
not replaced until the 1920s, with the adop- 
tion of the metric system. In 1984, the country 
defined the chi to be one-third of a metre. 

As a physicist, Crease is drawn to the drive
for quantification, uniformity and precision 
in measurement — the main concern of his 
book. As a philosopher, he understands that 
there is more to quantification than these sci- 
entific virtues. The value of education and of 
scientific research, for example, cannot be 
measured wholly by examination results and 
citation indices. Nor can the fitting of clothes 
be entirely mechanized, as Crease concludes, 
after trying out various body scanners used 
by US retailers. Science cannot proceed 
without precise measurement, yet successful 
measuring systems cannot be divorced from 
everyday human dimensions.


Andrew Robinson is the author of The 
Story of Measurement. 
e-mail: andrew.robinson33@virgin.net
Books in brief


A Strange Wilderness: The Lives of the Great Mathematicians 
Amir D. Aczel STERLING 304 pp. $24.95 (2011)
A poet-mystic; a swordsman clad in green taffeta; a 12-year-old
who mastered ancient Greek. Omar Khayyam, René Descartes and
Gottfried Leibniz are just three of the mathematical greats in Amir 
Aczel’s trot through theorems and the lives behind them. Aczel, 
author of Fermat’s Last Theorem (1996), begins with the Greeks; 
ponders the geniuses of India, Arabia and China; frolics in the hotbed 
of the Italian Renaissance; examines the founders of calculus and 
the wunderkinder of the Napoleonic age; and skids to a halt with 
Alexander Grothendieck, who learnt maths in a Nazi internment camp. 


Death and Oil: A True Story of the Piper Alpha Disaster on the North Sea
Brad Matsen PANTHEON 224 pp. $25.95 (2011)
More than two decades before the Deepwater Horizon oil spill,
the Piper Alpha oil rig exploded in the North Sea, killing 167 men.
Writer Brad Matsen has packed in two years of research, during 
which he has interviewed survivors, managers, rescue teams and 
government officials. Matsen is thorough in laying out the scientific, 
technological, industrial and political context. This is a deftly told tale 
of human error, technological glitches and corporate reluctance that 
highlights the high cost of our thirst for crude. 


How We See the Sky: A Naked-Eye Tour of Day and Night
Thomas Hockey UNIVERSITY OF CHICAGO PRESS 224 pp. $60 (2011)
Images of the Horsehead Nebula from the Hubble Space Telescope 
are more familiar to most of us than the sight of the sky above our 
heads. So argues astronomer Thomas Hockey, who urges us to 
gaze unaided at the Universe. Starting with a scan of the horizon, 
Hockey takes us through the science as well as a host of cultural 
references, from Pink Floyd to the Pyramids of Giza in Egypt. He 
explores the astronomical sky, the 88 constellations and the Milky 
Way; orientation through azimuth to zenith; lunar and solar motion, 
solstices and eclipses. A heavenly and often humorous journey. 


The Physics Book: From the Big Bang to Quantum Resurrection, 250 Milestones in the History of Physics
Clifford A. Pickover STERLING 528 pp. $29.95 (2011)
Molecular biophysicist and inventor Clifford Pickover follows
his 2009 volume The Math Book with this energizing look at
250 discoveries in physics. Bookended by the Big Bang and the
‘quantum resurrection’, the landmark events run from Archimedes’ 
burning mirrors, Isaac Newton’s prism, the Higgs boson and the 
Doppler effect to dark energy, Wolfgang Pauli’s exclusion principle 
and rogue waves. Luminaries from Archimedes to Fritz Zwicky get 
their due, and it is gorgeously illustrated throughout. 


Great Discoveries in Medicine
Edited by William Bynum and Helen Bynum THAMES & HUDSON 304 pp. £24.95 (2011)
Dazzling images adorn this crisply written chronicle of ‘eureka’
moments in medicine, covering our emergent knowledge of the
body, diseases, drugs and surgery. A drawing of a Caesarean section
in Hermann Friedrich Kilian’s nineteenth-century Geburtshülflicher
Atlas has the delicacy of Flemish Renaissance art. Other marvels
include the first X-ray (of Mrs Röntgen’s ringed hand), a photograph of
serotonin crystals and computer-generated images of viruses. 

PALAEOANTHROPOLOGY


Craniums with clout 


A look at two early human fossils reveals the prejudices in ideas about human
evolution, finds Henry Gee.


We have all seen the canonical
parade of apes, each one becom-
ing more human. We know that,
as a depiction of evolution, this line-up is
tosh. Yet we cling to it. Ideas of what human 
evolution ought to have been like still colour 
our debates. 

Palaeoanthropologist Dean Falk debunks 
some modern myths in her brilliant book, 
The Fossil Chronicles, by comparing the case 
histories of two famous fossils. A career 
spent teasing meaning from the brain casts 
of fossil hominins (creatures more closely 
related to Homo sapiens than to chimpan- 
zees) has led Falk into the debate on the 
cognitive abilities of Homo floresiensis. 
This dwarfed hominin — nicknamed the 
Hobbit — lived on the Indonesian island 
of Flores between approximately 95,000 
and 14,000 years ago, and was discovered 
in 2003 (see Nature 431, 1055-1061; 2004). 
Falk also describes Raymond Dart’s 1924 
discovery in South Africa of a juvenile skull 
of Australopithecus africanus, the Man-Ape 
of South Africa, and locates an unpublished 
manuscript by Dart on the find that chimes
with her views.

Almost every time someone claims to
have found a new species of hominin,
someone else refutes
it. The species is said to be either a mem- 
ber of Homo sapiens, but pathological, or an 
ape. Brickbats of the first kind were levelled 
recently at H. floresiensis — that it wasn't 
a genuine species, but a modern human 
suffering from one of several kinds of micro- 
cephaly or from cretinism. But they had also 
been aimed at Neanderthal Man, discovered 
back in 1856, and thought by some to be the 
remains of a Mongolian Cossack from the 
Napoleonic wars. Accusations of apishness 
were aimed at A. africanus, described by 
Dart in these pages in 1925; and at Sahelan- 
thropus tchadensis, nicknamed Toumai, a 
very primitive putative hominin from Chad, 
discovered in 2001. 

Dart’s original paper on A. africanus was, it 
is true, long on waffle and short on substance. 
But the reason that this small-brained,
possibly erect-walking creature took two
decades to be accepted as a hominin was
that researchers were in thrall to the idea
that the expansion of the human brain
came first, before the adoption of a fully
erect gait. This preconception was supported
by the discovery of the large-brained,
ape-jawed Piltdown Man in 1912. The fact
that it took 40 years to expose Piltdown as
a fraud is a mark of how deeply rooted such
prejudices can be.

The Fossil Chronicles: How Two Controversial Discoveries Changed Our View of Human Evolution
DEAN FALK
University of California Press: 2011. 280 pp. $34.95


Falk describes her 
work refuting the idea 
that the small brain of 
the Hobbit implies the creature might have 
had a congenital disorder of brain growth. 
She shows that its brain most resembled that 
of Homo erectus, another antique hominin, 
and was developed in areas associated with 
cognitive abilities that would have sup- 
ported making the simple tools with which 
its fossils are associated. Yet the Hobbit has 
a closer resemblance in its general anatomy 
to Australopithecus, suggesting — again con- 
trary to preconception — that homi- 
nins emerged from Africa much 
earlier than thought. 

The best parts of the book 
are those in which Falk traces 
the history of Dart. The 
Australian anatomist was 
exiled to South Africa by 
his mentor at University 
College London, Grafton 
Elliot Smith. In Africa, 
chance threw in his 
way the brain cast and 
skull of a juvenile 
hominin: a creature 
he named A. afri- 
canus. After pub- 
lishing his Nature 
paper, Dart was 
critically mauled by 
the London establishment — notably the ‘Piltdown
Committee’ who believed in the fake fossil —
and he almost deserted palaeoanthropology,
devoting his energies to building up capacity
at the then-fledgling University of the
Witwatersrand in Johannesburg.

Scans enabled researchers to model the brain shape of Homo floresiensis.

Almost, but not quite. Falk’s investiga- 
tion of Dart’s papers at Witwatersrand has 
brought to light a monograph on A. afri- 
canus that Dart never published. In 1929 
he sent it to Elliot Smith to submit to the 
Royal Society in London, but it was rejected, 
presumably on the basis of reports by the 
Piltdown Committee. Falk reveals that Dart 
had come to similar conclusions about the 
cognitive capacity of A. africanus as she has 
with H. floresiensis, providing circumstan- 
tial evidence for her link between Australo- 
pithecus and the Hobbit, and for an earlier 
African diaspora. Brains might be small, but 
they can still pack a punch. 

Falk’s book is worth reading just for the 
unearthing of this otherwise lost manu- 
script, vital to the history of palaeoanthro- 
pology. That it sparkles with scholarship and 
wit is icing on the cake.


Henry Gee is a Senior Editor of Nature. 


K. SMITH, MALLINCKRODT INSTITUTE OF RADIOLOGY 


K. C. ARMSTRONG/CORBIS 


Q&A Margaret Atwood 
Speculative realist 


Novelist Margaret Atwood’s essay collection In Other Worlds: SF and the Human Imagination,
published this month, is a companion piece to her dystopian fictional world of global warming 
and engineered plagues. The Canadian author discusses where she gets her science, and her 
concerns for the future. 


Does science run in your family?


My father was an 
entomologist — he 
studied sawflies, bud- 
worms and insects 
that eat trees, so as a
child I spent a lot of time in the forest. My 
brother is a neuroscientist who studies syn- 
apses, one nephew is a physicist studying 
the composition of the Universe, another 
is a materials engineer studying crystal 
structure. My grades were a bit better in 
science than in English, so I easily could 
have become a biologist: I'd probably be 
cloning potatoes now, making them glow 
in the dark. But I started writing instead. 


In Other Worlds: SF and the Human Imagination
MARGARET ATWOOD
Nan A. Talese/Virago: 
2011. 272 pp. 
$24.95/£17.99 


You say in your new book that your novels 
are not science fiction, but speculative 
fiction. What’s the difference? 

It is hard to draw that line. A lot of what is 
labelled science fiction has nothing to do 
with science. It tends to be something that 
doesn't fit into any other genre, so it is all put 
in the same box. But to me there is a differ- 
ence between a science-fiction novel such as 
Ursula LeGuin’s The Left Hand of Darkness — 
which contains things that are very unlikely 
to happen, or impossible — and a speculative
novel such as George Orwell’s 1984, which 
really could happen. My books are more like 
the latter — I don't write about Planet X. 


You also note that we’re preoccupied today 
with dystopias. Why is that? 

We're not feeling very hopeful about our 
future. In the nineteenth century, everybody 
thought they had a bright idea that would 
make life better. We wrote about utopias and 
model communities. The future was seen as 
a place of infinite advance. Then came the 
two World Wars and a number of totali- 
tarian societies that came in on a utopian 
ticket. The Soviet Union promised won- 
derful things and put on a good show, but 
meanwhile Stalin was starving Ukraine and 
butchering millions of people. We remember 
those experiences and know too much about 
them. It has become less and less possible to 
write a utopia that isn’t some form of Stepford 
Wives or Brave New World. 


What sort of future do you imagine in your 
books Oryx and Crake and The Year of the 
Flood? 
Genetic engineering is commonplace. A 
scientist named Crake designs a race of 
improved humans that are better adapted to 
their environment. They don’t have to wear 
clothes because they’ve got built-in sun 
block and insect repellent. They'll never
have to farm because they eat leaves. They're [...]
no sexual jealousy. And they will drop dead
at the age of 35, so they won't have age- 
related illnesses. To make room for them, 
Crake arranges to eliminate everybody else 
with a bioengineered epidemic. Having fun 
yet? However, not everybody is eliminated. 
Oryx and Crake is told from the point of view 
of one survivor. In The Year of the Flood, 
which tells a parallel story, we find that a few 
other people have also survived because they 
took precautions. 


How do you keep track of science? 

A number of scientists follow me on Twitter. 
They pass along reports of advances such as 
transplanting human brain cells into ani- 
mals, or making meat in the lab, or creating 
a new gene. Some of the things I wrote about
in Oryx and Crake hadn't actually happened 
then, although you could see them com- 
ing and they have been done since. Other 
things that people thought I'd made up, like 
the goat-spider mix and the light-up rabbit, 
were already real. 


In Other Worlds cautions that, given the 
risks of biotechnology and cryogenics, “we 
should leave well enough alone”. Why? 
Humans will play with their toys until some- 
thing blows up. Once you let it out of the box, 
it is hard to put it back in. We now have the
ability to create human-specific diseases to 
which nobody has any immunity and deploy 
them simultaneously all over the world. Cryo- 
genics, on the other hand, is a nonstarter: you 
get your head frozen, the money runs out, 
your relatives die, and you're cat food. 


Why does science scare some people? 
Science is attractive to those who like solv- 
ing puzzles. But it is not so appealing for 
people who want to be cuddled (or even 
reprimanded), who want to feel that things 
make sense, or that somebody’s looking after 
them. Scientists do not offer certainty, and 
they do not offer a universe that is centred 
around humans. Religions offer a world view 
in which you are important. 


Does the future worry you? 

I’m past the age when things scare me. But 
if I were younger, I would be looking down 
the line with some apprehension. A world 
with more than 9 billion people is not going 
to be very habitable. We've already used 90% 
of the fish in the sea. Global warming will 
make it worse: more droughts, more extreme 
weather and limited harvests. People think 
they will fix the problem with technology, 
but famine may fix it for us. Either way it 
will be a pretty miserable life. The infinite 
inventiveness of humans sometimes makes 
me feel hopeful, but we're just as capable of 
inventing horrible things as good things.


INTERVIEW BY JASCHA HOFFMAN 

Reduce drug waste
in the environment 


Environmental contamination 
by pharmaceuticals is reaching 
alarming levels (see, for example, 
Nature 476, 265; 2011) and is 
set to rise. New partnerships 
between drug companies, the 
public-health sector and those 
who deliver environmental 
sustainability are urgently 
needed to tackle the issue. 

Low-cost pharmaceuticals 
are increasingly accessible to 
the global population, which 
is predicted to exceed 8 billion 
by 2050. Rising drug use is also 
driven by ageing populations. 
Widely used preventative 
medication — such as statins 
and anti-hypertensives — and 
cheap generic drugs add to the 
problem. The UK Office of 
National Statistics predicts that 
the country’s medicine usage will 
more than double by 2050. 

Agricultural soils and rivers 
are contaminated with a range 
of pharmaceuticals, including 
antibiotics, antidepressants, 
analgesics and cancer- 
chemotherapy agents (see 
go.nature.com/Ir2vfy). 

The effects are already 
evident: they include the 
feminization of fish by residues 
of the contraceptive pill, and the 
deaths of millions of vultures 
on the Indian subcontinent 
following ingestion of the anti- 
inflammatory drug diclofenac. 
Antibiotic overuse has led to 
the emergence of resistant 
pathogenic bacteria in the wider 
environment, and not just in 
medical settings. 

Current practices remain 
unchanged. However, attempts 
are being made to provoke action. 
The European Environment 
Agency has recommended 
that improvements be made 
in pharmaceutical-waste 
management, and that more 
guidance be provided for the 
public and for policy-makers. 
The UK Royal Commission 
on Environmental Pollution 
in March highlighted links 
between demographic change
and pharmaceutical releases, and
the UK government's Advisory
Committee on Hazardous
Substances will conduct an
investigation.

Michael Depledge European 
Centre for Environment and 
Human Health, University of 
Exeter, UK. 
michael.depledge@pms.ac.uk 


Bridging the gender 
gap in UK science 


Sally Davies, the Chief Medical 
Officer for England, has broken 
new ground for gender equality 
in the sciences in a letter to the 
UK Medical Schools Council 
on 29 July. She will make it 
a requirement for academic 
departments applying for 
funding from the English 
National Institute for Health 
Research to hold the silver 
award of the Athena SWAN (for 
‘scientific women’s academic 
network’) Charter. We urge other 
funding bodies, including the 
UK research councils and the 
Royal Society, to follow suit. 

The charter recognizes good 
employment practice for women 
in UK science, engineering and 
technology (SET). It is supported
by the Equality Challenge Unit 
and the UK Resource Centre for 
Women in SET. 

The charter invites UK
universities and university-
linked research institutes
and departments to apply for
bronze, silver or gold awards.
These awards promote career 
development as well as gender 
equality. For example, they 
encourage improved mentoring 
and guide parents in how to 
partition their time between 
academic and family life. 

Departments can only 
apply for a silver award if their 
university already holds a bronze 
award. If a department does not
get its silver award immediately, 
Athena SWAN advisers can 
recommend improvements. 
Davies's silver-award 
requirement will come into play 
for the next round of funding 
in four years’ time — a practical 
measure that gives universities 
and departments time to get 
their bronze and silver awards. 

The Athena SWAN 
Charter currently recognizes 
35 bronze universities, 11 
bronze departments, 40 silver 
departments and one gold
department (the University of 
York’s chemistry department). 
The next deadline for 
submissions is 30 November. 
Athene Donald University of 
Cambridge, UK. 

Paul H. Harvey, Angela R. 
McLean University of Oxford, UK. 
paul.harvey@zoo.ox.ac.uk 
Competing interests declared.
See http://dx.doi.org/10.1038/478036b.


Cloning advance calls 
for careful regulation 


In this issue, Scott Noggle
and colleagues describe the
generation of human pluripotent 
stem cells using somatic cells and 
human oocytes, a technique that 
bypasses ethical concerns about 
exploiting fertilized embryos for 
their medical potential (Nature 
478, 70-75; 2011). 

The cell lines were produced 
at the New York Stem Cell 
Foundation using private funds, 
in accordance with the Empire 
State Stem Cell Board’s policy 
of compensating egg providers 
for research. Unfortunately, 
many scientists will not have 
access to these cells, owing to 
regulations that prevent the 
publicly funded use of stem cells 
derived from research embryos 
and compensated egg donors. 
These policies — including 
those in place in California and 
at the US National Institutes of 
Health — are well intentioned, 
but possibly misguided. 

To generate their stem-cell 
lines, Noggle et al. use human 
oocytes in a new twist to the 
cloning technique known as 
somatic-cell nuclear transfer 
(SCNT). Societal fears about 
reproductive cloning should not 
force knee-jerk legislation to ban 
all forms of human SCNT. The 
use of this technique in research 
has a clearly regulated goal: to 
provide patient-specific stem- 
cell lines to help treat human 
disease, just like the almost 
universally supported research 
involving induced pluripotent
stem cells, which are derived
artificially from somatic cells.

Regulatory policies for 
SCNT in different states and 
countries must be in agreement, 
and should adhere to the 
ethical guidelines for oocyte 
procurement issued by the 
International Society for Stem 
Cell Research. This will help 
to ensure that SCNT research 
proceeds with the requisite 
oversight and fair recruitment 
and compensation practices for 
oocyte providers (see also Nature 
442, 629-630; 2006). 
Insoo Hyun, Paul Tesar Case 
Western Reserve University, 
Cleveland, Ohio, USA. 
insoo.hyun@case.edu 
Competing financial interests declared.
See http://dx.doi.org/10.1038/478036c.


Energy should form 
its own discipline 


The international energy system 
needs an overhaul. The sector is 
multidisciplinary: it must serve 
modern civilization without 
compromising economic 
opportunity, undermining 
national security or impinging 
on the environment. Yet 
innovation today prioritizes 
improvements to discrete 
technologies and progress in 
single disciplines rather than 
rebuilding the whole system. 

A more joined-up approach
is needed, beginning with
education.

Retooling the system will 
require a range of experts who 
understand new technologies 
and can translate them to the 
public, while considering the 
economic drivers necessary for 
their adoption. 

In the United States, for 
example, the educational 
framework for undergraduates 
does not always keep pace with 
advances in science, engineering 
and innovation. Even though 
energy is a leading international 
priority, it lacks definition in 
universities, where it is largely 
perceived as a professional 
pursuit, or as a subset of fields 
such as petroleum engineering. 
Often, students are exposed only 
to glimpses of the sector and 
do not acquire an integrated,
systems-level perspective.

Whereas institutions such as 
Duke University in Durham, 
North Carolina, the University 
of Texas at Austin and the 
University of British Columbia 
in Vancouver, Canada, have 
created programmes to address 
the changing energy landscape, 
none offers an interdisciplinary 
energy-focused degree at 
undergraduate and graduate 
levels. 

We propose that large energy 
departments should be set up 
at universities worldwide to tie 
seemingly disparate fields of 
knowledge together. Graduates 
could move between disciplines 
to promote ideas and work 
towards practical solutions. 
By fostering an open dialogue 
between specialists, this nascent 
labour force would then be well 
equipped to navigate through 
all of the technical, political and 
social issues related to energy. 
Sheril R. Kirshenbaum, 
Michael E. Webber University of 
Texas at Austin, Texas, USA. 
sheril.kirshenbaum@mail.utexas.edu


Giant dam threatens 
Brazilian rainforest 


Brazil’s rainforest is under 
further threat from plans to 
build a giant hydroelectric dam 
on the Xingu River, a tributary 
of the Amazon River in Pará
state. Plans for the dam, known 
as Belo Monte, have been 
approved by the environment 
agency. These come on top of 
pending changes to the Brazilian 
Forest Code that could allow 
deforestation of up to 20 million 
hectares of rainforest (Nature 
476, 259-260; 2011). 

The US$17-billion dam, 
together with four planned 
upstream dams, will have 
a combined hydroelectric 
potential of 21,600 megawatts. 
Leaders of the Brazilian energy 
sector argue that the dams 
could help to preserve the 
Amazon. But their construction 
will flood vast areas of tropical 
rainforest, jeopardizing 
ecosystem functions and 
species survival, increasing 
greenhouse-gas emissions and 
displacing tens of thousands of 
forest peoples.

Brazil is a world leader in 
clean-energy production. 
However, the dams will release 
into the atmosphere enormous 
quantities of methane — a 
greenhouse gas that is 25 times 
more potent than carbon 
dioxide. 

Much of the electricity 
generated by Belo Monte is likely 
to be used in the production of 
aluminium ingots for export (see 
go.nature.com/latlx3), making 
the environmental and social 
impact of the dam’s construction 
even harder to justify. 

Brazil must strive to control 
deforestation more effectively 
by strengthening its forest 
laws and consolidating the 
United Nations’ REDD policy 
(for ‘reduced emissions from 
deforestation and forest 
degradation’). Otherwise, 
the steady destruction of the 
country’s tropical rainforest will 
have consequences well beyond 
its borders. 

Alison G. Nazareno Federal 
University of Santa Catarina, 
Florianópolis, Brazil.
alison_nazareno@yahoo.com.br 
Thomas E. Lovejoy Heinz 
Center for Science, Economics and 
the Environment, Washington
DC, USA.


Pilot scheme for 
misconduct database 


Researchers, journal editors 
and scientific institutions 
should work together to 
improve communication about 
misconduct cases. Although 
published retractions are logged 
by PubMed and other databases, 
and by blogs such as Retraction 
Watch (http://retractionwatch. 
wordpress.com), the scientific 
community needs a way to 
identify flawed articles that have 
not been formally retracted but 
have been assessed as containing 
falsified data or having ethical 
problems (see, for example, 
Nature 476, 263-264; 2011). 

To this end, we have piloted 
an open database of publications 
for which misconduct has been 
established by committees (such 
as offices of research integrity 
within research institutions). 
The database is collaborative
and is coupled to an online
platform on which scientific 
integrity can be openly and 
constructively debated (see 
www.scientificredcards.org and 
T. Flutre et al. Eur. Sci. Ed. 36, 
51-52; 2010). 

The website focuses on the 
publications and not the authors, 
to avoid ‘naming and shaming’.
It has been legally validated 
by the French National 
Commission on Informatics 
and Liberty, so that such 
information can be made public 
while respecting privacy laws. 

To expand this initiative, the 
legal implications would have 
to be considered. It would need 
to be endorsed by the research 
community, which would 
cooperate to maintain and 
moderate it. Extensive publicity 
would be essential to ensure that 
the facility is used effectively. 

Our pilot project offers a 
route to reinforcing society's 
trust in science. Creating a 
public library of misconduct 
through a collaborative 
web platform is a timely, 
transparent and efficient way 
for the research community to 
communicate about possible 
scientific impropriety. 
Timothée Flutre University of 
Chicago, Illinois, USA. 

Thomas Julou Ecole Normale 
Supérieure, Paris, France. 

Livio Riboli-Sasco Paris 
Descartes University, France. 
Claire Ribrault Université Paris 
Diderot, France. 
contact@scientificredcards.org 


Discovery inspires 
those seeking tenure 


I strongly disagree with David 
Helfand’s view that tenure is 
the “social filter” that selects 
for professors who are “most 
attracted to lifetime security” 
(Nature 477, 158-159; 2011). 
For most young scientists 
attempting to scale the ladder of 
modern academia, I wager that 
the promise of scientific and 
intellectual discovery and the 
chance to inspire younger people 
are the real incentives. 
Barbara-Ann Lewis 
Northwestern University, 
Evanston, Illinois, USA. 
b-lewis@northwestern.edu 

FORUM Stem cells


Triple genomes go far 


A technique called somatic-cell nuclear transfer has been applied to human oocytes, resulting in the generation of 
personalized stem cells, albeit genetically abnormal ones. Two experts discuss the biomedical significance of this 
work and the ethical issues surrounding the use of human oocytes in research. SEE ARTICLE P.70 


THE PAPER IN BRIEF

● Somatic-cell nuclear transfer (SCNT)
involves replacing the genome of an oocyte
with that of an adult cell.
● Once the ‘reconstructed’ cell has
developed into a blastocyst (a mass of
70-100 cells), stem-cell lines can be derived.
● Human oocytes manipulated by SCNT do
not develop to the blastocyst stage.
● To overcome this problem, Noggle
et al. (page 70) added the nucleus of a
differentiated adult cell to an oocyte that still
contained its nucleus (Fig. 1).
● This allowed growth to the blastocyst
stage, but, undesirably, the resulting cells
had three genome copies — one from the
haploid oocyte and two from the diploid
differentiated cell.
● Nonetheless, the adult genome copies
reverted to gene-expression programs
characteristic of embryonic stem cells.
● Moreover, the stem cells isolated from the
blastocysts could differentiate into cells of all
three germ layers, from which all the tissues
and organs of the body develop.
● Noggle and colleagues paid women for
their oocytes.
● There are significant legal and social
concerns about obtaining human oocytes for
research and even therapy.


Imperfect yet
striking
GEORGE Q. DALEY


Noggle and colleagues’ study is noteworthy
for generating the first — albeit geneti-
cally abnormal — human pluripotent stem
cells through oocyte-mediated reprogram-
ming and for highlighting major technical
barriers to SCNT using human eggs.

Since the first isolation of human embryonic
stem (ES) cells in 1998, a compelling strategy
for the future envisaged exploiting SCNT to
generate personalized embryonic stem cells.
The aim has been to reprogram a patient's dif-
ferentiated cells to pluripotency — the poten-
tial to produce any tissue — and then to coax
the resulting SCNT-ES cells to develop into
disease-relevant cells, either for mechanistic
studies or for combined gene and cell therapy.
Realistically, however, SCNT is a cumbersome
process that cannot be readily scaled to allow
widespread therapeutic use.

One breakthrough was the discovery that
skin cells can be reprogrammed to a pluri-
potent state by enforced expression of only four
transcription factors linked to pluripotency in
ES cells. The resulting induced pluripotent
stem (iPS) cells, whether mouse or human,
are functionally comparable to ES cells and 
provide an alternative to SCNT for generating 
personalized stem cells for disease modelling 
or cell-based therapies free of the problems of 
rejection. 

Despite enthusiasm for iPS cells, however, 
closer scrutiny of their genetic integrity and 
differentiation behaviour has revealed subtle 
yet potentially significant differences from 
ES cells. As well as provoking rogue genetic 
changes, reprogramming can leave vestiges
of the original differentiated (somatic) cell’s
identity — known as epigenetic memory —
through faulty remodelling of chemical modi-
fications on DNA and its associated proteins.

Although it is premature to conclude that 
these foibles of iPS cells pose insurmountable 
risks, comparative studies of mouse stem cells 
suggest that SCNT may be more effective than 
forced expression of transcription factors in 
reprogramming cells to a pristine state of pluri-
potency and erasing epigenetic memory. But
until now, discussions of the relative merits of
human SCNT-ES cells and iPS cells have been
purely theoretical: although successful in non-
human primates, the generation of ES cells
through SCNT has thus far failed in humans, 
largely because human oocytes have not been 
readily available for research. 

With the advantage of ready access to a 
large number (270) of donor oocytes, Noggle 
et al. performed a rigorous exploration of
SCNT and identified obstacles to the genera- 
tion of normal human blastocysts by this tech- 
nique. The researchers found that products of 
SCNT in humans stop dividing at the 6-10-cell 
stage, because removal of the oocyte genome 
apparently depletes the cell of factors that 
are essential for embryonic cell division
or expression of genes from the somatic
genome. Frustratingly, they could not over-
come this cleavage arrest unless they left the
oocyte genome in place; the cells they derived
from the resulting blastocysts were therefore
triploid somatic–oocyte pluripotent stem
cells. Nonetheless, the authors’ sophisticated
analysis revealed that the transplanted genome
was fully reprogrammed, with no signs of
epigenetic memory. Thus, although falling
short of its ultimate goal, the paper stands as a
stepping stone towards success, and raises the
provocative question of how human SCNT-ES
cells might perform relative to iPS cells.

Figure 1 | Three genomes are better than two? a, Typically, when the diploid nucleus of a differentiated
adult human cell such as a skin fibroblast is transferred into a nucleus-free human oocyte, the resulting cell
does not develop to the desired blastocyst stage. b, Noggle and colleagues show that leaving the haploid
nucleus of the oocyte behind results in the generation of triploid cells that develop to the blastocyst stage.
The authors isolated stem cells from these blastocysts (not shown) and found that the derived cells could
differentiate into various cell types.


George Q. Daley is in the Stem Cell 
Transplantation Program, Division of 
Pediatric Hematology/Oncology, Howard 
Hughes Medical Institute, Children’s Hospital 
Boston, Boston, Massachusetts 02115, USA. 
e-mail: george.daley@childrens.harvard.edu 


Persons versus 
things 
JAN HELGE SOLBAKK 


What are oocytes? What is their nature?
What conceptual labels should be
attached to such entities? What regulatory 
frameworks should be in place to regu- 
late their procurement for reproduction or 
research? And how should such transactions 
be acknowledged? These are some of the ques- 
tions that came to my mind when reading 
Noggle and colleagues’ paper.

Since the time of Roman law, legal thinking 
has operated with a fundamental distinction 
between person and thing. Even today, the 
entities subject to regulation are either per- 
sons or things, and there is no third option.
This conceptual lacuna continues to generate 
regulatory paradoxes in the health and life sci- 
ences, because many of the entities subject to 
regulation — including bodies, body parts, 
organs and tissues, and sperm and oocytes 
— cannot be considered either persons or 
mere things. 

How, then, should researchers proceed to 
procure oocytes? The approach Noggle et al. 
have taken is to pay 16 women for their oocytes 
and acknowledge their contribution as study 
participants. I believe this is a step in the right 
direction for three reasons: first, it transfers the 
focus from the entities procured to the subjects 
providing them; second, this refocusing avoids 
reducing the oocytes to mere things or com- 
modities open for transactions according to 
the rules of the market; and finally, the word 
‘participation’ paves the way for acknowledg- 
ing the women’s contribution as a piece of work 
for which they should be duly paid. 


The standard argument against paying 
gamete donors is that the contribution is only 
material — and therefore marginal — com- 
pared with that of the researchers involved. 
But whether a differential valuation between 
intellectual input and input of a material or 
manual kind is justified is questionable. As 
bioethicist Søren Holm wrote’: “In a future
situation where there are many groups deriv- 
ing stem cells, and many donors providing 
embryos or gametes for the derivation, every- 
one’s contributions will be equally accidental 
and contingent...” If one group of accidental 
contributors (the researchers) is entitled to 
benefit financially from their contribution, 
why deny payment to another group of acci- 
dental contributors (the oocyte providers) for 
their work? 

Another argument against paying oocyte 
providers is that this would undermine the 
voluntary nature of the consent process and 
give an undue incentive to participate in such 
research’. This argument also seems to be 
based on questionable grounds, because the 
prospect of obtaining future financial ben- 
efits from participating in research may also 
represent a sort of undue inducement for the 
researchers. Besides, an indication that the 
women involved in the present study’ did not 
necessarily participate for financial gain is that 
they were all fully employed. 

The way Noggle et al.' have chosen to deal 
with the oocyte issue does not comply neatly 
with existing regulatory guidelines in the field
of stem-cell research. For this, in my view, they 
deserve praise rather than criticism, because 
their approach helps to draw attention to a 
possible way out of the regulatory quagmire 
resulting from reduction of oocyte providers to 
‘donors’ or ‘gift givers’ deserving merely com- 
pensation for their gifts. The authors’ approach 
represents the first step towards acknowl- 
edging women as genuine participants — 
co-producers even — in the generation of new 
knowledge.


Jan Helge Solbakk is in the Centre for 
Medical Ethics, Faculty of Medicine, 
University of Oslo, Box 1130, Blindern, 
0318 Oslo, Norway. 

e-mail: j.h.solbakk@medisin.uio.no 
Homing in on
another Earth 


The identification of the closest analogue of Earth so far, orbiting another star, 
suggests that small planets are common, and that the discovery of a candidate 
habitable planet in an alien star system could be just around the corner. 


JACOB BEAN 


In the hunt for planets around stars other
than the Sun, astronomers’ primary objective
is to find a planet that is teeming with
life. A milestone on the path to this goal is the
discovery of Earth-sized planets orbiting their
parent stars in the ‘habitable zone’ — the range
of distances from the star at which the temperature
would be just right for liquid water
to be present on a planet’s surface. But which
stars harbour such planets, how common are
they, and what are their basic characteristics?
Although astronomers can’t yet answer
these questions, a paper to be published in
Astronomy & Astrophysics by Pepe et al.'
presents the discovery of several planets that
marks a significant step towards changing this
impasse”.

Pepe and colleagues detected five small 
planets orbiting parent stars that are slightly 
smaller and cooler than the Sun. One of the 
planets is only 3.6 times the mass of Earth and 
is in an orbit that teases the inner edge of its 
host star’s habitable zone. This is the closest that 
astronomers have yet come to finding another 
Earth. Furthermore, the relative ease with which 
these and other previously reported small plan- 
ets have been found by the same group implies 
that the frequency of such planets around Sun- 
like stars is on the order of tens of per cent. 

*This article was published online on 28 September
2011.

The authors made their new discoveries’
using the radial-velocity technique, which is an
indirect method based on measuring the periodic
change in speed of a star caused by the gravitational
tug of orbiting bodies. Until this year,
most planet discoveries were made with this 
method. But there has been a widely held opin- 
ion that the radial-velocity technique will not 
be able to find candidate habitable Earth-mass 
planets despite its success at finding Jupiter- 
mass planets. The reasoning behind this is that 
the variability of the visible surfaces of stars, as 
a result of magnetic activity, pulsations and the 
turbulence of the plasma in their atmospheres, 
creates noise in radial-velocity measurements, 
and this noise is larger than the signal induced 
by an Earth-sized planet in the habitable zone. 
Also, building instruments sensitive enough
to detect the subtle indications of such planets,
even around perfectly ‘quiet’ stars, was
considered too technically challenging.

Pepe and colleagues’ detections' were 
enabled by a combination of exquisite instru- 
mentation, painstaking analysis and an 
ingenious observational technique. Using 
an ultra-stable instrument on a dedicated 
telescope, and with the benefit of ten years’ 
experience in refining their calibration strat- 
egy, the authors set out to intensively monitor 
ten of the quietest known nearby stars. Their 
monitoring campaign involved making mul- 
tiple measurements per night to average over 
the stellar noise cycles that had plagued pre- 
vious studies. Using this approach, they have 
broken through the previously inviolate one- 
metre-per-second radial-velocity barrier and 
detected planets that have signals less than ten- 
fold larger than that of an Earth analogue, for 
which the signal is 9 cm s⁻¹. Pepe et al. were also
careful to show that the detected signals were 
most probably attributable to orbiting planets 
rather than to intrinsic variations in the stars 
themselves. They did this by demonstrating 
that the orbital periods determined for the 
planets were distinct from the host stars’ 
rotation periods, and also that there were no 
correlations between the detected signals and 
the diagnostics of stellar activity. 
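
As a rough check on the 9 cm s⁻¹ figure quoted for an Earth analogue, the sketch below evaluates the standard two-body expression for the radial-velocity semi-amplitude of a star. The function name, the rounded physical constants and the choice of a circular, edge-on, one-year orbit are illustrative assumptions, not values taken from Pepe et al.

```python
import math

# Rounded physical constants (assumed textbook values, SI units).
G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30           # solar mass, kg
M_EARTH = 5.972e24         # Earth mass, kg
YEAR_S = 365.25 * 86400.0  # one year, s


def rv_semi_amplitude(period_s, m_planet, m_star, eccentricity=0.0, sin_i=1.0):
    """Reflex-velocity semi-amplitude K of the star, in m/s, for a Keplerian orbit."""
    return ((2.0 * math.pi * G / period_s) ** (1.0 / 3.0)
            * m_planet * sin_i
            / (m_star + m_planet) ** (2.0 / 3.0)
            / math.sqrt(1.0 - eccentricity ** 2))


# Earth-mass planet, one-year circular orbit, solar-mass star, seen edge-on.
k_earth = rv_semi_amplitude(YEAR_S, M_EARTH, M_SUN)
print(f"Earth analogue: K = {k_earth * 100:.1f} cm/s")
```

Under these assumptions the result is close to 9 cm s⁻¹, roughly a tenth of the one-metre-per-second level mentioned above. Because K scales almost linearly with planet mass when the planet is much lighter than the star, a planet of a few Earth masses on a shorter orbit produces a correspondingly larger signal.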

Although Pepe and colleagues’ study is a
clear breakthrough, it has some limitations.
One limitation is that the newly detected
planets are poorly characterized. We do not 
know the compositions of the planets because 
the radial-velocity method yields no informa- 
tion about the densities of the planets it detects. 
Also, we have little knowledge of how elliptical 
the planets’ orbits are (their orbital eccentric- 
ity), because the detected signals are relatively 
small. Both composition and orbital eccen- 
tricity are crucial parameters for assessing a 
planet’s habitability. 

Furthermore, although Pepe and colleagues’ 
results’, and those described in another paper 
from the same group’, hint at a large popula- 
tion of small planets orbiting Sun-like stars, 
the statistics are weak because these planets are 
difficult to detect and therefore the samples are
incomplete by an unknown amount. Identifying
a large sample of similar planets, and
studying them by different methods, would 
advance the field. However, Pepe et al. show 
that the full potential of the radial-velocity 
technique has not yet been realized. This work 
warrants the building of a generation of radial-velocity
[...]

[...] Kepler mission,
which finds planets 
using the transit technique, announced a haul 
of more than 1,000 new planet candidates 
earlier this year, and is on track to identify 
habitable-zone planets’. Because the transit 
technique detects planets by measuring peri- 
odic decreases in a star’s brightness when a 
planet passes in front of it, the issue of stel- 
lar variability is also a limiting factor for this 
approach, and early results indicate that the 
stars Kepler is looking at are significantly 
more variable than expected*. This means that 
an extension beyond the nominal 3.5-year
mission duration might be necessary for Kepler 
to securely detect Earth-sized planets in the 
habitable zones of Sun-like stars. 
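
To illustrate why stellar variability matters for the transit technique, the sketch below estimates how much an Earth-sized planet dims a Sun-like star, using the simple geometric approximation that the fractional dip equals the ratio of the two disc areas. The radii are rounded reference values and the function name is hypothetical; limb darkening and grazing transits are ignored.

```python
# Rounded reference radii in metres (assumed values).
R_SUN = 6.957e8
R_EARTH = 6.371e6


def transit_depth(r_planet, r_star):
    """Fractional drop in stellar brightness during a central transit."""
    return (r_planet / r_star) ** 2


depth = transit_depth(R_EARTH, R_SUN)
print(f"Earth-Sun transit depth: {depth * 1e6:.0f} parts per million")
```

A dip of less than one part in ten thousand is easily masked by intrinsic brightness fluctuations of the star, which is consistent with the suggestion that more transits, and hence a longer mission, may be needed for a secure detection.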

In the long run, astronomers aim to study 
the atmospheres of the small worlds revealed 
by radial-velocity and transit surveys to obtain 
further insight into the planets’ habitability and 
even, perhaps, their state of inhabitance. The 
investigations that will be enabled by NASA’s 
planned James Webb Space Telescope, which is 
currently at risk of cancellation, are the corner- 
stone of astronomers’ next plans in this area. 
Nevertheless, it is clear that the search for other 
Earths is gathering pace. The exciting results 
from Kepler, and the remarkable advances in 
the radial-velocity technique demonstrated 
by Pepe et al., show that the race is well and 
truly on.
Defence against
oxidative damage 


Macular degeneration is a leading cause of blindness in the elderly in the 
developed world. Hope for prevention and treatment comes from the discovery 
of a protective mechanism against oxidative damage to the eye. SEE LETTER P.76 


FERNANDO CRUZ-GUILLOTY & 
VICTOR L. PEREZ 


Dangerous oxygen radicals that are
sometimes generated during metabolic
processes can damage cellular components.
The eye, with its constant exposure
to light and its high metabolic rate, is particu- 
larly susceptible to such damage, or oxidative 
stress. If left unchecked, this can be cumula- 
tive, leading to age-related macular degenera- 
tion. Almost two-thirds of people over the age 
of 80 have this condition, and between 30 mil- 
lion and 50 million individuals are affected 
worldwide, with a frequency in industrialized 
countries similar to that of cancer’. On page 76 
of this issue, Weismann et al.” describe how a 
protein normally associated with an immune 
pathway also protects against inflammation 
induced by oxidative stress in age-related 
macular degeneration. By combining in vitro
and in vivo data from human patients and ani- 
mal models, the authors provide a plausible 
explanation for the cause of this devastating 
chronic disease. 

The macula region of the retina is required 
for central vision and is heavily populated with 
photoreceptors. These convert the light enter- 
ing the eye into electrical and molecular signals 
that are transmitted to the brain for visual pro- 
cessing. The retina’s outer segments are replen- 
ished daily, and the resulting debris is cleared 
away by the retinal pigment epithelial cells. If 
these cells become dysfunctional, a build-up 
of debris (drusen) can occur in the vicinity of 
photoreceptor cells in the macula, which are 
then more likely to die off, leading to the irre- 
versible loss of vision seen in patients with age- 
related macular degeneration (AMD). Many 
environmental and genetic factors have been 
correlated experimentally with AMD progression,
including oxidative damage (induced by
factors such as smoking and light exposure)
and inflammation’.

Figure 1 | Mechanisms at work in age-related
macular degeneration. Oxidative damage to
lipids in the cell membrane generates the reactive
decomposition product malondialdehyde (MDA),
which forms adducts with cell proteins. Normal
complement factor H (CFH) binds MDA with
high affinity, blocking inflammatory reactions.
Mutant CFH, in which the amino acid histidine
is substituted for tyrosine at residue 402, fails to
bind MDA, so inflammation cannot be prevented,
leading to age-related macular degeneration
(AMD) and blindness.

A collection of proteins known as comple- 
ment factors form part of the innate immune 
system, which is the first line of defence 
against pathogens. These factors interact with 
one another in a sequence of stimulatory or 
inhibitory steps in a cascade known as the 
complement pathway. Complement proteins 
have been implicated in certain pathologi- 
cal conditions, and have been found in the 
accumulated drusen of patients with AMD*. 
Variations in the DNA sequence at particu- 
lar sites, or polymorphisms, in genes encod- 
ing complement factors have been associated 
with the development of AMD, suggesting 
that inflammation is an important component 
of the disease. 

A polymorphism in complement factor H 
(CFH) confers a significant risk of developing
AMD*”. CFH is an inhibitor of the comple- 
ment pathway and therefore has anti-inflam- 
matory activity. The relevant single-nucleotide 
polymorphism in the CFH gene produces an 
amino-acid change from tyrosine to histidine 
at position 402 (the Y402H mutation) in the 
protein. The functional consequences of this 
mutation have been elusive until now, but 
Weismann et al.” convincingly show that it
directly affects the ability of CFH to control 
the inflammation associated with AMD. 

The story began with the group’s interest 
in malondialdehyde (MDA) — a common 
decomposition product of lipid peroxidation 
by oxygen radicals. It reacts with cellular pro- 
teins to form adducts that can act as markers of 
oxidative stress. The MDA-modified proteins 
induce inflammatory responses and are recog- 
nized by the innate immune system. They are 
found in many physiological and pathological 
conditions, including atherosclerosis, AMD*” 
and other chronic degenerative diseases. 

Weismann et al. show that CFH peptides 
constitute the majority of MDA-binding 
proteins. A series of cleverly designed experi- 
ments clarified the physical and functional 
features of the CFH-MDA interaction. This 
turned out to be highly specific, with CFH 
binding to MDA whatever its carrier pro- 
tein, but not to other oxidative products. 
The authors mapped the CFH domains that 
specified MDA binding, including a short 
segment known as SCR7, which contains 
the Y402H mutation. The Y402H CFH vari- 
ant from AMD patients showed a markedly 
reduced ability to bind MDA compared with 
normal CFH (Fig. 1). 

Weismann and colleagues further demon- 
strate that CFH and MDA co-localize in the 
eyes irrespective of whether these organs are 
affected by AMD. This suggests that in the 
healthy eye CFH protects the macula, and in 
dying (apoptotic) cells it recognizes MDA- 
protein adducts. Perhaps the researchers’ most 
important result from a therapeutic viewpoint 
is that CFH can prevent MDA-mediated pro- 
inflammatory effects in at least two cell types 
associated with AMD — retinal pigment 
epithelial cells and macrophages. 

Weismann and co-workers’ findings” 
answer the long-standing question about 
the role of CFH in AMD but they also raise 
other questions. MDA is ubiquitously gener- 
ated in a variety of inflammatory settings, but 
we don't know whether its connection with 
CFH is relevant outside the eye. The authors 
found that other members of the CFH family 
that bind MDA block CFH activity, suggest- 
ing that the subtle regulation of complement 
activity needs to be examined in more detail. 
It will be interesting to see whether other 
oxidation-induced modifications associ- 
ated with AMD, including those induced by 
carboxyethylpyrrole”’, interact with proteins 
in a way similar to the MDA-CFH para- 
digm. Answers to such questions could help 
in the fight against AMD and other chronic 
inflammatory diseases.
Malapa and the
genus Homo


Two remarkably well-preserved skeletons of the hominin species
Australopithecus sediba, found at Malapa, South Africa, show an intriguing
combination of features, and open up a debate about the origins of the genus Homo.


FRED SPOOR 


Following on from the announcement
last year by Berger et al.’ of the remains
of a newly discovered hominin species,
Australopithecus sediba, the same group has
now published five reports*® in Science
detailing additional fossils and further analyses.
Cave deposits at the Malapa site in South Africa
yielded two partial skeletons, which Pickering
et al.° have found to be 1.977 ± 0.002 million
years (Myr) old. These skeletons
are not only well preserved and
remarkably complete, but also show
a surprising mix of morphological
characters. Given the completeness
of the skeletons, the unexpected
combination of primitive and derived
morphology, and the likelihood that
further individuals will be recovered
at Malapa, A. sediba certainly has
the potential to uproot conventional
views of human evolution.

Overall, the authors find that A.
sediba is australopith-like, with a
small brain and long arms, and is 
most similar to its likely ancestor 
Australopithecus africanus, remains 
of which have been found at several 
South African sites. However, some 
aspects of the A. sediba skeletons 
seem to show a closer resemblance 
to the morphology found in species 
of the genus Homo. These include 
aspects of the shape of the pelvis’ 
and ankle joint’, as well as the long 
thumb and short fingers that are 
characteristic of hands capable of 
precise manipulation*. The authors 
suggest that these features are 
phylogenetically shared with Homo 
species, rather than being examples 
of homoplasy (similar traits that
evolved independently in separate lineages),
and conclude that A. sediba is a plausible 
candidate ancestor of Homo. 

Early species of the genus Homo — H. habilis, 
H. rudolfensis and H. erectus — appear in the 
fossil record about 1.9 Myr ago (Fig. 1). Of 
these, H. erectus stands out because it gave 
rise to later Homo species (including mod- 
ern humans), dispersed out of Africa, and 
became extinct less than half a million years 
ago. Homo habilis, or a species similar to it,
is commonly considered to have been the
ancestor of H. erectus, but it is difficult to be
sure of this because only a small number of
fragmentary fossils older than 1.9 Myr have
been attributed to Homo, and these could not
be attributed definitively to a specific species.
The fossil most secure in its affinities and
provenance is the approximately 2.35-Myr-old
upper jawbone from Hadar, Ethiopia’, which is
more Homo-like than that of A. sediba and predates
the Malapa finds by some 370,000 years.
This evidence seems at odds with the idea that
A. sediba was involved in the first appearance
of Homo.

Figure 1 | Temporal distribution of selected hominin species. The
bar diagram shows when various hominins (two australopiths, red, and
various Homo species, blue) appear in the fossil record. The pale blue bar
represents fragmentary fossils that are generally thought to come from
early Homo. Of these, an upper jawbone from Hadar, Ethiopia (black
line on the pale blue bar), is well dated at 2.35 million years (Myr) old,
and is the most convincingly Homo-like. Lines connecting bars indicate
hypothetical ancestry between species. The most recent addition to the
diagram is Australopithecus sediba — two skeletons” of the hominin
found at Malapa, South Africa, are 1.977 ± 0.002 Myr old. Two scenarios
have been proposed” in which A. sediba is the ancestor of the genus
Homo. In the first scenario’, fossils at Malapa come from a late-surviving
population of A. sediba, whose earlier representatives (pink bar) were
ancestral to Homo (dashed line). In the second scenario’, the A. sediba
population at Malapa was itself ancestral to early Homo (dotted line),
which means that fossils pre-dating 2 Myr ago (pale blue) cannot be
attributed to Homo.

In their original publication’, Berger et al. 
suggested that A. sediba could have origi- 
nated much earlier than the time to which the 
remains were dated, with the Malapa sam- 
ple representing a late-surviving population 
(Fig. 1). However, in their latest report®, the 
authors go further by concluding that even 
A. sediba fossils as late as those preserved at 
Malapa could have been the ancestor of Homo. 
As a logical corollary, they also contest the 
Homo affinities of any fossil older than 2.0 Myr.
What’s more, the authors hint at the possibility
that A. sediba itself, rather than a species
such as H. habilis or H. rudolfensis, was
ancestral to H. erectus. It will, however, be dif- 
ficult to uphold the suggestion that the exten- 
sive evolutionary change required could have 
occurred in the time available (a maximum of 
80,000 years) if A. sediba at Malapa gave rise 
to Homo species. Moreover, the idea that no 
fossil older than 2.0 Myr is legiti- 
mately attributable to Homo is highly 
debatable — the arguments provided 
in the paper are insufficiently specific 
to be conclusive, particularly with 
respect to the Hadar jawbone. 
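
The two time intervals used in this argument follow directly from the dates quoted in the text. The short sketch below reproduces the arithmetic in thousands of years; the variable names are illustrative, the roughly 1.9-Myr first appearance of early Homo is treated as exact, and the ±0.002 Myr dating uncertainty is ignored.

```python
# Ages in thousands of years (kyr), taken from the dates quoted in the article.
HADAR_JAWBONE_KYR = 2350      # approximately 2.35 Myr
MALAPA_SKELETONS_KYR = 1977   # 1.977 Myr
EARLY_HOMO_KYR = 1900         # about 1.9 Myr

# How much older the Hadar jawbone is than the Malapa skeletons.
hadar_lead_kyr = HADAR_JAWBONE_KYR - MALAPA_SKELETONS_KYR

# Time available for Malapa A. sediba to give rise to early Homo species.
evolution_window_kyr = MALAPA_SKELETONS_KYR - EARLY_HOMO_KYR

print(f"Hadar jawbone predates Malapa by {hadar_lead_kyr} kyr")
print(f"Window between Malapa and early Homo: {evolution_window_kyr} kyr")
```

These values, 373 kyr and 77 kyr, round to the "some 370,000 years" and "maximum of 80,000 years" cited in the argument.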

Another question is whether the 
authors’ morphological analyses*® 
do indeed suggest that A. sediba 
has closer evolutionary links with 
H. erectus than do H. habilis or 
H. rudolfensis. The answer seems to 
be no, mainly because the required 
comparisons either were not made 
or cannot be made in the absence 
of fossil evidence. For example, 
the morphology of the entire post- 
cranial skeleton of H. rudolfensis is 
unknown, as is that of the pelvis of 
H. habilis, and very few hand and 
foot bones of H. erectus have been 
recovered, which means that none 
of these bones can be compared with 
those of A. sediba. Conversely, sev- 
eral brain endocasts — casts of the 
inside of fossil braincases — of spe- 
cies of early Homo are available®, but 
the authors compared’ their endo- 
cast of A. sediba only with those of 
modern humans and chimpanzees, 
and with two A. africanus fossils. 

The rear and base of the A. sediba 
cranium are not preserved, which is unfor- 
tunate as these areas are highly diagnostic of 
H. erectus. However, Berger et al.' have argued 
that the two species share two characteristic 
features in other parts of the cranium that 
are not present in H. habilis or H. rudolfensis. 
One feature is a slightly swollen area under 
the eye socket, but neither the definition nor 
the expression of this character in hominins 
is well established. The other feature is the 
amount of constriction of the braincase relative 
to the breadth of the face: both A. sediba and 
H. erectus show less constriction than other 
early hominin species. 

At first sight, this shared characteristic does 
seem to be convincing evidence supporting a 
link between A. sediba and H. erectus. How- 
ever, the constriction of an individual’s brain- 
case changes significantly, late in development; 
Berger and colleagues’ studied a juvenile 
A. sediba that would have developed greater 
constriction had it lived to adulthood. Further- 
more, early H. erectus shows greater constric- 
tion than do geologically later forms, and in 
this respect is similar to H. habilis. When all of 
these factors are taken into account, the exclu- 
sive grouping of A. sediba with H. erectus no 
longer seems clear. 

Taken together, the published evidence'® 
indicates that A. sediba is a late australopith 
that has several intriguing Homo-like fea- 
tures. If these features do indeed associate 
A. sediba with the emergence of Homo, rather 
than reflecting homoplasy, then it seems that 
the scenario in which the Malapa specimens 
represent a late surviving population’ is the 
most plausible explanation for Berger and 
colleagues’ findings. 

Many reviews of palaeontological research 
end with the statement that it would be highly 
desirable to recover more fossils. In this 
case, however, the Malapa team has already 
done that. The interpretation of their find- 
ings may be a matter of debate, but they have 
undoubtedly added a spectacular and thought- 
provoking sample to the hominin fossil record. 
This achievement represents a major contribu- 
tion to the study of human evolution in all its 
complexity.
Self-aware particles


The signature of the self-interactions that a colloid in solution undergoes has 
been observed. The observation has implications for single-particle studies of 
soft matter and biological systems. SEE LETTER P.85 


ULRICH F. KEYSER 


Have you ever floated a small ball on the
surface of a body of water? If you have,
you’ll probably know that, no matter
how careful you are, the ball usually moves 
because of the waves created by its impact 
on the water surface. The ball moves around 
until all of the waves’ energy has dissipated 
and the water becomes still. During this pro- 
cess, the ball interacts with the water and with 
itself. This striking macroscopic phenomenon 
should be valid for all situations in which an 
object moves in an incompressible medium. 
Any particle completely immersed in a liquid 
should also strongly self-interact. However, 
both models of and experiments on particle 
fluctuations on micrometre-length scales often 
dismiss this hydrodynamic self-interaction — 
which can be justified if water motion stops on 
timescales that are experimentally inaccessible. 
However, on page 85 of this issue, Franosch 
et al.' show that it is possible to detect the 
hydrodynamic self-interaction of a particle in 
solution undergoing Brownian motion. 

Franosch and colleagues monitored
the Brownian motion of a single colloid
(a micrometre-sized sphere) in liquid held in 
a single-beam optical trap (Fig. 1a). Optical 
traps can confine a single colloid to a region 
spanning just a few tens of nanometres in all 
three dimensions”. They are easily created by 
tightly focusing a laser beam, which essentially 
transforms the colloid in the trap into a har- 
monic oscillator. The colloid’s Brownian fluc- 
tuations, which are caused by the impact of the 
surrounding solvent molecules, can then be 
monitored (Fig. 1b). In the classical ‘over- 
damped’ regime, one would expect the fluctua- 
tions to be uncorrelated with one another and 
hence to display a white power spectral density 
— that is, the magnitude (power) of the fluctu- 
ations as a function of frequency is constant at 
small frequencies (Fig. 1c). However, the power 
spectral density should change when the col- 
loid’s hydrodynamic self-interactions become 
relevant and, as a result, the fluctuations 
are correlated in time. 
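
The 'white' low-frequency behaviour of the classical overdamped regime can be illustrated with the textbook Lorentzian power spectrum of a bead in a harmonic trap. The sketch below is only this classical baseline: it does not model the hydrodynamic self-interaction reported by Franosch et al., and the trap stiffness, friction coefficient and temperature are hypothetical values chosen for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def corner_frequency(stiffness, friction):
    """Corner frequency (Hz) of an overdamped bead in a harmonic trap."""
    return stiffness / (2.0 * math.pi * friction)


def overdamped_psd(freq_hz, stiffness, friction, temperature_k=293.0):
    """One-sided Lorentzian position power spectral density, in m^2/Hz."""
    f_c = corner_frequency(stiffness, friction)
    return K_B * temperature_k / (math.pi ** 2 * friction * (f_c ** 2 + freq_hz ** 2))


# Hypothetical parameters: Stokes friction for a 1.5-micrometre-radius sphere
# in a fluid of viscosity 3.2e-4 Pa s, and an assumed trap stiffness.
friction = 6.0 * math.pi * 3.2e-4 * 1.5e-6   # kg/s
stiffness = 1.0e-4                           # N/m

f_c = corner_frequency(stiffness, friction)
for f in (f_c / 1000.0, f_c / 100.0, f_c / 10.0, f_c):
    ratio = overdamped_psd(f, stiffness, friction) / overdamped_psd(0.0, stiffness, friction)
    print(f"f = {f:10.2f} Hz  PSD / PSD(0) = {ratio:.4f}")
```

Well below the corner frequency the ratio stays essentially at 1, which is the flat spectrum described above; any excess above this curve would indicate correlations that the overdamped picture leaves out.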

Figure 1 | Coloured noise. a, Franosch et al.' immersed a single colloid in a liquid and trapped it
in the focal spot of a laser. b, They then monitored the colloid’s Brownian fluctuations in position
as a function of time. c, The resulting dimensionless power spectral density — the magnitude of the
fluctuations as a function of frequency (inverse of time) — displayed an increase (coloured noise; red)
due to hydrodynamic self-interactions; the classical ‘overdamped’ regime of oscillations (grey)
lacks this increase.

To observe this effect, Franosch and col-
leagues’ tuned two timescales that govern
colloidal motion in the optical trap. The
first is the relaxation time of the harmonic
oscillator (the time it takes the colloid to reach its
equilibrium position), which is inversely 
proportional to the trap stiffness, or strength. 
Because trap stiffness increases with laser 
power, the authors could easily tune the relaxa- 
tion time. The second tuned timescale is the 
diffusion time — the time taken by the fluid to 
diffuse over the diameter of the particle. This 
time can be controlled by varying the particle's 
diameter and the fluid’s viscosity. To optimize 
their measurement conditions, Franosch 
et al. used colloids made of melamine resin 
with diameters of around 3 micrometres and 
immersed them in acetone. 
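
A rough calculation makes the two timescales concrete. The sketch below assumes a Stokes drag coefficient γ = 6πηa, so that the relaxation time is τ = γ/k, and uses the conventional estimate ρa²/η for the fluid diffusion time (written with the particle radius a rather than its diameter). The trap stiffness and the acetone properties are illustrative assumptions, not values reported in the study.

```python
import math

def relaxation_time(radius_m, viscosity_pa_s, stiffness_n_per_m):
    """Relaxation time of an overdamped trapped sphere: tau = gamma / k."""
    gamma = 6.0 * math.pi * viscosity_pa_s * radius_m  # Stokes drag coefficient
    return gamma / stiffness_n_per_m

def fluid_diffusion_time(radius_m, viscosity_pa_s, density_kg_m3):
    """Time for fluid momentum to diffuse across the sphere: rho * a^2 / eta."""
    return density_kg_m3 * radius_m ** 2 / viscosity_pa_s

# Illustrative inputs (assumptions, not taken from the study).
RADIUS = 1.5e-6        # m, half of a diameter of about 3 micrometres
ETA_ACETONE = 3.1e-4   # Pa s, approximate room-temperature viscosity
RHO_ACETONE = 785.0    # kg/m^3, approximate room-temperature density
STIFFNESS = 2.0e-4     # N/m, hypothetical trap stiffness

tau_k = relaxation_time(RADIUS, ETA_ACETONE, STIFFNESS)
tau_f = fluid_diffusion_time(RADIUS, ETA_ACETONE, RHO_ACETONE)
print(f"relaxation time:      {tau_k:.2e} s")
print(f"fluid diffusion time: {tau_f:.2e} s")
print(f"ratio:                {tau_k / tau_f:.1f}")
```

Doubling the assumed stiffness halves the relaxation time, which is the tuning knob the text attributes to laser power; the diffusion time depends only on particle size and fluid properties.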

By using high time resolution in particle 
detection and high laser power, the authors 
managed to decrease the relaxation time 
to just six times that of the faster fluid diffusion 
time. This allowed them to observe correlated 
fluctuations due to the colloid’s hydrodynamic 
self-interaction. A direct consequence of 
this hitherto undetected correlated behaviour 
is an increase (resonance) in the power spec- 
tral density at frequencies close to the inverse 
of the relaxation time. Franosch et al. show 
that the resulting power spectral density is not 
white but ‘coloured’, because not all frequencies
have the same magnitude (Fig. 1c). Interestingly,
as shown by their computer models,
the observed increase in the power spectral 
density depends on a particle’s shape and 
surface properties. 
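
For comparison, the classical overdamped reference (the grey curve in Fig. 1c) can be sketched with the textbook model in which the thermal force has a flat spectrum and the position spectrum is a Lorentzian. This is only that reference case, under the assumption of Stokes drag and a harmonic trap; it does not reproduce the hydrodynamic resonance described above, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def corner_frequency(stiffness, gamma):
    """Corner frequency f_c = k / (2 * pi * gamma) of the trapped bead."""
    return stiffness / (2.0 * math.pi * gamma)

def lorentzian_psd(freq_hz, stiffness, gamma, temperature_k=295.0):
    """Two-sided position power spectral density (m^2/Hz), overdamped model."""
    diffusion = K_B * temperature_k / gamma  # Einstein relation, D = k_B T / gamma
    f_c = corner_frequency(stiffness, gamma)
    return diffusion / (2.0 * math.pi ** 2 * (f_c ** 2 + freq_hz ** 2))

# Hypothetical trap: stiffness 2e-4 N/m, drag coefficient 8.8e-9 kg/s.
k_trap, drag = 2.0e-4, 8.8e-9
f_c = corner_frequency(k_trap, drag)
for f in (0.1 * f_c, f_c, 10.0 * f_c):
    print(f"f = {f:10.1f} Hz  PSD = {lorentzian_psd(f, k_trap, drag):.3e} m^2/Hz")
```

In this model the spectrum is flat well below the corner frequency and falls off as 1/f² above it, with no peak; any excess near the inverse relaxation time is therefore a departure from the overdamped picture.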

Franosch and colleagues’ study has several 
ramifications for the interpretation of single- 
particle studies of soft matter. Hydrodynamic 
coupling between colloids and macromolecules 
is well known to play a part in multi-particle 
systems because of the long-range nature of 
the interaction. For instance, it can be used 
to explain the coordinated motion of pairs of 
flagella in Chlamydomonas algae’. Efforts are 
also under way to investigate the coupling 
of several colloids with a view to creating an 
artificial and autonomous swimmer’. In addi- 
tion to their relevance on micrometre-length 
scales, hydrodynamic interactions are crucial 
for understanding phenomena such as the 
transport of DNA through nanopores under 
the influence of an electric field’. 

The results¹ discussed here are remarkable.
For the first time, the self-interaction
between a single colloid and its surrounding 
medium is conclusively demonstrated. From 
these experiments, it seems clear that a single 
particle in solution is aware of its own pres- 
ence. Such awareness fundamentally changes 
the particle’s thermal fluctuations, which are 
mediated by hydrodynamic self-interactions. 
These findings highlight the fact that hydro- 
dynamic phenomena, often dismissed at the 
micrometre and nanometre scales in viscous 
solutions, have to be taken into account 
and can even be exploited for new sensing 
applications. One obvious, truly label-free
sensing application could be to distinguish
shape, changes in diameter and surface
morphology of small particles (or even
living cells) in optical traps by monitoring
the shape of the power spectral density.

A hidden ancestral
legacy trumped


A previously unsuspected genetic mechanism underlies a type of muscular 
dystrophy common in Japan. A therapeutic approach based on this finding and 
tested in mice has come up with encouraging results. SEE LETTER P.127 


MASAYUKI NAKAMORI & CHARLES THORNTON 


The symptoms of a genetic disorder
known as Fukuyama-type congenital
muscular dystrophy start in infancy and
lead to severe disability and premature death in
childhood or adolescence¹. It is one of the most
common recessive diseases in Japan and results
from disruption of the gene encoding a protein
called fukutin. On page 127 of this issue,
Taniguchi-Ikeda et al.² identify and correct the
molecular steps that result in the expression
of faulty fukutin by the damaged gene.
About 100 generations ago, a segment of 
DNA in a Japanese ancestor was copied from 
one genomic site and reinserted into another, 
landing in the fukutin gene³. Genetic events
of this kind, in which DNA segments called 
retrotransposons are copied and pasted in 
many places in the same genome, are not 
rare, occurring with an estimated frequency 
of roughly one per 20 births⁴. The immediate
descendants of this particular event probably
suffered no harm, because disrupting one copy
of fukutin has no adverse effects.

Figure 1 | Abnormal splicing of the fukutin gene. a, The normal fukutin gene contains an inaccessible
splice donor site within its exon 10 (green asterisk). The resulting messenger RNA encodes the normal
fukutin protein. ‘Start’ and ‘Stop’ indicate the beginning and the end of the mRNA sequence encoding the
protein. b, In patients with Fukuyama-type congenital muscular dystrophy (FCMD), ancestral insertion
of a retrotransposon within the final exon activates this inaccessible splice donor site and creates a new
splice acceptor site (red star) in the retrotransposon sequence. Such ‘exon trapping’ results in incorrect
splicing of the mRNA and, following translation, modification of the carboxy terminus of the fukutin
protein and impaired glycosylation of α-dystroglycan. Taniguchi-Ikeda et al.² correct this by using
splice-blocking antisense oligonucleotides (not shown) to restore expression of normal fukutin.

The consequences for modern Japan, 
however, are quite different. Around 1 in 90 
Japanese individuals now carry this same 
retrotransposon insertion, being descended 
from the unknown ancestor. The children of 
two such individuals are at risk of inheriting 
two disrupted copies, which results in loss of 
fukutin activity. Although the exact function 
of fukutin is unknown, it is clearly involved in 
the attachment of carbohydrate molecules to 
the α-dystroglycan protein⁵,⁶. This protein is
anchored to the cell surface and, when prop- 
erly modified by carbohydrates through a 
process called glycosylation, it forms a crucial 
link between the intracellular cytoskeleton 
and the extracellular matrix. In the absence 
of fukutin, glycosylation is incomplete, and 
the link is broken. This causes abnormal neu- 
ronal migration during development, mental 
retardation and progressive degeneration of 
muscle cells. 
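
The carrier frequency quoted above implies a rough birth incidence. The sketch below is an illustrative estimate only: it assumes random mating among carriers, counts only this one retrotransposon insertion, and applies the standard one-in-four Mendelian odds for the child of two carriers. The resulting figure is not a number taken from the article.

```python
from fractions import Fraction

def recessive_birth_incidence(carrier_frequency):
    """Expected fraction of births with two disrupted copies (random mating assumed)."""
    both_parents_carriers = carrier_frequency ** 2
    child_inherits_both = Fraction(1, 4)  # Mendelian odds for two carrier parents
    return both_parents_carriers * child_inherits_both

incidence = recessive_birth_incidence(Fraction(1, 90))
print(f"about 1 in {incidence.denominator} births")
```

Under these assumptions the expected incidence is (1/90)² × 1/4, or about 1 in 32,400 births.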

Because the consequences of Fukuyama- 
type congenital muscular dystrophy (FCMD) 
are so devastating, the hunt for the disease 
gene was naturally undertaken with hopes that 
identification of the mutation would point the 
way to developing a treatment. But short of 
using gene therapy to restore a normal copy of 
the fukutin gene, its discovery⁷ had no immediate
therapeutic implications. The gene’s
discoverers continued to pursue the problem, 
however, and 13 years later they have found a 
potential opening. 

First, they correct a misconception about 
how the retrotransposon affected fukutin 
expression. Initial studies had indicated that 
the insertion caused a near-complete absence 
of fukutin messenger RNA, which fitted obser- 
vations that retrotransposons can silence 
gene expression. Taniguchi-Ikeda et al.² re-examine
the problem and find that, although
part of the transcript is missing, the overall 
amount of fukutin mRNA is not appreciably 
reduced. However, they notice that splicing 
of the fukutin transcript, a process in which 
different parts of the primary transcript 
are joined to create the mature mRNA, is 
dramatically affected. 

The authors found that the effects on fuku- 
tin splicing and function in mice were very 
similar when they artificially inserted the same 
retrotransposon at the identical location in the 
mouse fukutin gene. They showed that a splice
‘donor’ site in the final exon (protein-coding
region) that had previously been inaccessible
was activated and became joined to a newly
created splice ‘acceptor’ site in the retrotransposon
sequence, a process known as exon trapping
(Fig. 1). Retrotransposon insertions are
known to cause exon trapping⁸, but this is the
first example to show a clear association with 
disease. Because of this splicing alteration, the 
carboxy terminus of fukutin is eliminated and
replaced instead with amino acids encoded by
the retrotransposon sequence. Exactly how
this error compromises the glycosylation of
α-dystroglycan is unclear, but it may be that the
mutant fukutin protein is routed to the wrong
cellular compartment².

To test whether normal fukutin expression 
could be restored by correcting the abnormal 
splicing, the researchers designed ‘antisense’ 
oligonucleotides to suppress exon trapping. 
These molecules are short DNA-like fragments 
that bind, according to the rules of nucleic-acid 
hybridization, to the fukutin transcript before 
it is spliced, thereby favourably altering the 
outcome of the splicing process. This approach 
had been used previously to suppress or shift 
splicing sites in other disease states, including 
other forms of muscular dystrophy⁹,¹⁰. In cells
derived from patients with FCMD, the antisense
oligonucleotides had the intended effect
of blocking the deleterious splicing event. As
predicted, this led to re-expression of normal
fukutin protein and re-establishment of the
link between α-dystroglycan and extracellular-matrix
proteins. Injecting these oligonucleotides
into mice carrying the retrotransposon
insertion partially restored normal fukutin
protein in muscle tissue, again with improved
α-dystroglycan glycosylation.
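
The hybridization rule that lets an antisense oligonucleotide find its target can be sketched in a few lines. The example below derives the DNA-like sequence that pairs with a stretch of pre-mRNA by taking its reverse complement. The target sequence is hypothetical and is not the fukutin splice site; oligonucleotides used in practice usually carry chemical modifications that this sketch ignores.

```python
# Watson-Crick partners for each RNA base, written as DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}

def antisense_dna(target_rna):
    """Return the DNA oligonucleotide (5'->3') complementary to an RNA target (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(target_rna.upper()))

# Hypothetical 10-base stretch around a splice site (illustration only).
target = "GUAAGUAUCC"
oligo = antisense_dna(target)
print(f"target RNA (5'->3'): {target}")
print(f"antisense  (5'->3'): {oligo}")
```

Because the two strands pair in opposite orientations, the oligonucleotide is read in reverse relative to its target; an oligonucleotide bound over a splice site is what the text means by splice blocking.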

Could this strategy be adopted to treat 
children with FCMD? Possibly. But rescuing
the associated brain malformation would
presumably require treatment in utero, a difficult
undertaking. The major challenge to
the use of antisense oligonucleotides for splice
blocking is distributing them into cells in
sufficient quantity to influence splicing processes.
Although progress has been made in
addressing this problem¹⁰, a general solution,
applicable for brain and muscle tissue, is not
yet available. It is also possible that a similar
approach could be applied to other genetic
disorders associated with retrotransposon
insertion and exon trapping.


Masayuki Nakamori and Charles Thornton
are in the Department of Neurology, University
of Rochester Medical Center, Rochester,
New York 14642, USA.
e-mail: charles_thornton@urmc.rochester.edu


1. Fukuyama, Y., Osawa, M. & Suzuki, H. Brain Dev. 3, 1-29 (1981).
2. Taniguchi-Ikeda, M. et al. Nature 478, 127-131 (2011).
3. Toda, T. & Kobayashi, K. J. Mol. Med. 77, 816-823 (1999).
4. Xing, J. et al. Genome Res. 19, 1516-1526 (2009).
5. Hayashi, Y. K. et al. Neurology 57, 115-121 (2001).
6. Michele, D. E. et al. Nature 418, 417-422 (2002).
7. Kobayashi, K. et al. Nature 394, 388-392 (1998).
8. Hancks, D. C., Ewing, A. D., Chen, J. E., Tokunaga, K. & Kazazian, H. H. Jr Genome Res. 19, 1983-1991 (2009).
9. Dominski, Z. & Kole, R. Proc. Natl Acad. Sci. USA 90, 8673-8677 (1993).
10. Muntoni, F. & Wood, M. J. A. Nature Rev. Drug Discov. 10, 621-637 (2011).


The gentle cooling 
touch of light 


Laser light has been used to cool a nanomechanical resonator to its lowest energy 
state. The result opens the door to testing the principles of quantum mechanics and 
to applications in quantum information processing. SEE LETTER P.89 

When you face direct sunlight, besides
the brightness and heat that you
experience, there is a rather subtle
effect. The light produces a force pushing
at you, admittedly a tiny one, corresponding
to the weight of a few grains of sand. In
the past few years, however, researchers have 
learned how to harness these light forces in the 
nanoworld and to use them to manipulate the 
mechanical vibrations of small objects, with 
remarkable results. On page 89 of this issue, 
Painter and colleagues (Chan et al.) describe 
how they have exploited laser light to dampen 
the motion of a nanomechanical resonator. 
On entering the quantum regime, the vibrational
energy of the resonator is no longer
continuous. Instead, it is in the form of discrete
quanta called phonons. The authors’ experi- 
ment is the first successful attempt of this type 
to squeeze essentially all the phonons out of 
the resonator, leaving the system's vibrations 
in the lowest possible energy state allowed by 
quantum mechanics — the ground state. Their 
results finally pave the way for using light to 
realize many quantum-physical phenomena 
in such structures. 

The force exerted by light, called the radia- 
tion pressure force, was first demonstrated 
a little more than 100 years ago. Radiation 
forces have been remarkably successful in 
manipulating the motion of atoms (for exam- 
ple, in laser-cooling them or trapping them 
within optical lattices produced by the interference
of laser beams). They have also been
used to manipulate larger objects such as
glass beads, whose motion can be controlled
through ‘optical tweezers’.

Figure 1 | Coupling light and mechanical motion. a, Chan et al.¹ patterned
a free-standing silicon nanobeam with holes to trap incoming laser
light in its central region. This design allowed them to couple the light to
the nanobeam’s mechanical vibrations (not shown) and bring a particular
vibrational standing wave to the quantum-mechanical ground state. b, The
team already has designs" for two-dimensional photonic-crystal

Over the past few years, similar ideas have 
been applied to control the vibrational motion 
of nanofabricated structures. A typical set-up 
involves using a laser to illuminate an opti- 
cal cavity — an arrangement of two reflective 
mirrors that allows light to bounce back and 
forth between them — in which the circulat- 
ing radiation exerts a force on a mechanical 
element, such as a vibrating cantilever carrying 
one of the cavity’s mirrors. A large variety of set- 
ups is being investigated in this rapidly growing 
field of cavity optomechanics*. They involve, for 
example, not only membranes, microtoroids 
and nanoscale slabs termed nanobeams, but also 
vibrating structures coupled to superconducting 
electrical devices that are driven by microwave 
radiation instead of by laser light. The field is 
motivated both by fundamental questions 
about quantum mechanics and by more applied 
aspects, such as the ultrasensitive detection of 
small displacements or forces and possible uses 
in quantum information processing. 

To enter the quantum regime, mechanical 
vibrations have to be as cold as possible, which 
can be achieved by laser cooling. The basic idea 
is simple enough: send in laser light consist- 
ing of photons that do not have quite enough 
energy to enter the optical cavity, except when 
they grab an extra quantum of energy from the 
mechanical vibrations, thereby cooling them.
The essence of radiation-induced damping 
of mechanical vibrations was demonstrated’ 
in 1970 in a macroscopic set-up. In 2004, this 
principle was first applied to cooling a micro- 
mechanical resonator using a force created 
by the thermal effects of light’, and in 2006 
three groups” ’ showed the kind of radiation-pressure
laser cooling that has now¹ finally
led to cooling a nanomechanical resonator 
to the quantum ground state. Nevertheless, it 
proved hard to find a system that combined a 
sufficiently strong light-mechanics coupling 
with a weak enough coupling to the thermal
environment, and to pre-cool the system to low
temperatures using standard methods.

Chan et al.¹ have now overcome these challenges.
Their experiment is based on a design
introduced two years ago by Painter and 
colleagues’. A silicon nanobeam that has a 
suitable arrangement of holes (forming a pho- 
tonic crystal) traps incoming laser light in its 
central area, in a region not much larger than 
the wavelength of light, essentially forming an 
optical cavity (Fig. 1). The beam is free stand- 
ing, so it can vibrate, and there are standing 
waves of mechanical vibrations localized at the 
area where the light is trapped. These are of a 
high (gigahertz) frequency, making it easier to 
cool them, and the strong overlap between the 
tightly localized light field and the mechani- 
cal vibrations yields an exceptionally large 
optomechanical coupling. 

In addition, the team exploited the design 
flexibility of this ‘optomechanical crystal’ 
device to engineer a structure in which the 
damping of vibrational motion is strongly 
reduced. The combination of all these factors 
led to successful laser cooling to the ground 
state: starting with 100 phonons at a temperature
of about 20 kelvin, the team¹ was able to
reduce the energy of a particular vibrational
standing wave to less than one phonon on
average. Together with a recent analogous
experiment’ performed in the microwave
domain, Chan and colleagues’ study¹ opens the
door to exploring the quantum regime of cavity
optomechanics.
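
The phonon numbers quoted above can be checked against the Bose-Einstein occupation of a harmonic mode. The sketch below assumes a 4-gigahertz mode (the text says only that the frequency is in the gigahertz range) and uses the standard weak-coupling estimate in which the thermal occupation is reduced by the ratio of intrinsic damping to total damping. The damping ratio is hypothetical, and the estimate neglects the quantum limit set by the light itself.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def thermal_phonon_number(frequency_hz, temperature_k):
    """Mean Bose-Einstein occupation of a mode at the given temperature."""
    x = H * frequency_hz / (K_B * temperature_k)
    return 1.0 / math.expm1(x)

def cooled_phonon_number(n_thermal, intrinsic_damping, optical_damping):
    """Weak-coupling estimate: n = n_th * Gamma_m / (Gamma_m + Gamma_opt)."""
    return n_thermal * intrinsic_damping / (intrinsic_damping + optical_damping)

n_start = thermal_phonon_number(4.0e9, 20.0)          # assumed 4 GHz mode at 20 K
n_final = cooled_phonon_number(n_start, 1.0, 150.0)   # hypothetical damping ratio
print(f"starting occupation: {n_start:.0f} phonons")
print(f"after laser cooling: {n_final:.2f} phonons")
```

With these assumed inputs the starting occupation comes out at roughly 100 phonons, and light-induced damping some 150 times the intrinsic damping brings the average below one.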

With the latest advance, it will now become 
possible to produce non-classical states of 
light and mechanical motion. One example 
would be the generation and detection of 
quantum entanglement in the system — cor- 
relations between the light and the mechani- 
cal motion that are stronger than anything 
possible in classical physics. Ultimately, light 
could even be used to create entanglement 
between mechanical objects separated by a 
distance. Another enticing prospect is to engineer
optomechanical arrays and circuits that
couple many optical and mechanical oscillations.
Such designs could integrate several
functionalities, for applications such as sensing
and signal processing, or could be used to
study the collective dynamics of photons and
phonons on a chip. If the coupling between a
single photon and a single phonon could be
increased 500-fold from the value achieved
here, thus making it larger than the photon
decay rate in the current set-up¹, then interesting
nonlinear quantum effects could be
observed.

structures (similar to the one shown here) that might form the
basis for optomechanical circuits in which light and mechanical motion
could be coupled to one another and to optical (blue) and acoustic (red)
waveguides. The devices’ mechanical and optical functionalities are
purely the result of carefully engineering the shapes of the holes cut into
the structures.

Finally, researchers in quantum informa- 
tion science are delighted by the prospect of 
making a device, possibly based on the Painter 
team’s design”, that converts single phonons 
to photons. Combining such a device with the 
already demonstrated” strong coherent cou- 
pling between a nanomechanical resonator 
and a superconducting two-state quantum sys- 
tem, or qubit, it might be possible to realize an 
interface between such solid-state qubits and 
photons, which is much needed for quantum 
communication applications.

Persistence of soil organic matter as an
ecosystem property


Michael W. I. Schmidt, Margaret S. Torn, Samuel Abiven, Thorsten Dittmar, Georg Guggenberger, Ivan A. Janssens,
Markus Kleber, Ingrid Kögel-Knabner, Johannes Lehmann, David A. C. Manning, Paolo Nannipieri, Daniel P. Rasse,
Steve Weiner & Susan E. Trumbore


Globally, soil organic matter (SOM) contains more than three times as much carbon as either the atmosphere or terrestrial 
vegetation. Yet it remains largely unknown why some SOM persists for millennia whereas other SOM decomposes 
readily—and this limits our ability to predict how soils will respond to climate change. Recent analytical and 
experimental advances have demonstrated that molecular structure alone does not control SOM stability: in fact, 
environmental and biological controls predominate. Here we propose ways to include this understanding in a new 
generation of experiments and soil carbon models, thereby improving predictions of the SOM response to global warming. 


Understanding soil biogeochemistry is essential to the stewardship
of ecosystem services provided by soils, such as soil fertility (for 
food, fibre and fuel production), water quality, resistance to ero- 
sion and climate mitigation through reduced feedbacks to climate change. 
Soils store at least three times as much carbon (in SOM) as is found in either 
the atmosphere or in living plants’. This major pool of organic carbon is 
sensitive to changes in climate or local environment, but how and on what 
timescale will it respond to such changes? The feedbacks between soil 
organic carbon and climate are not fully understood, so we are not fully 
able to answer these questions” ’, but we can explore them using numerical 
models of soil-organic-carbon cycling. We can not only simulate feedbacks 
between climate change and ecosystems, but also evaluate management 
options and analyse carbon sequestration and biofuel strategies. These 
models, however, rest on some assumptions that have been challenged 
and even disproved by recent research arising from new isotopic, spectro- 
scopic and molecular-marker techniques and long-term field experiments. 
Here we describe how recent evidence has led to a framework for 
understanding SOM cycling, and we highlight new approaches that 
could lead us to a new generation of soil carbon models, which could 
better reflect observations and inform predictions and policies. 


The conundrum of SOM 


About a decade ago, a fundamental conundrum was articulated*: why, 
when organic matter is thermodynamically unstable, does it persist in soils, 
sometimes for thousands of years? Recent advances in physics, material 
sciences, genomics and computation have enabled a new generation of 
research on this topic. This in turn has led to a new view of soil-organic- 
carbon dynamics—that organic matter persists not because of the intrinsic 
properties of the organic matter itself, but because of physicochemical and 
biological influences from the surrounding environment that reduce the 
probability (and therefore rate) of decomposition, thereby allowing the 
organic matter to persist. In other words, the persistence of soil organic 
carbon is primarily not a molecular property, but an ecosystem property. 


This emerging view has not been fully implemented in global models or 
research design, for a variety of reasons. First, the knowledge gathered in 
the past decade has often been published in outlets of traditionally sepa- 
rated disciplines. As a result, confusion has arisen because these different 
disciplines can use the same vocabulary to mean different things, or vice 
versa. For example, ‘decomposition rates’ may mean the rate of mass loss 
of fresh litter, the production rate of CO₂ in a laboratory incubation, or the
rate inferred from input and loss of an isotopic tracer present in plant 
inputs to soil”'®. Second, the complexity of the soil system is difficult to 
incorporate into one conceptual model or to translate into a tractable yet 
accurate numerical model. Soil is a realm in which solid, liquid, gas and 
biology all interact, and the scale of spatial structures spans many orders of 
magnitude (from nanometre minerals to football-sized soil clods). Indeed, 
the spatial heterogeneity of biota, environmental conditions and organic 
matter may have a dominant influence on carbon turnover and trace gas 
production in soils. Last, the new knowledge remains more qualitative 
than quantitative. In many cases, it tells us what is important and suggests 
new model structures, but not how to parameterize them. 


Recent insights into carbon cycling 
Since pioneering work in the 1980s'', new insights gathered across 
disciplines (ranging from soil science to marine science, microbiology, 
material science and archaeology) have challenged several foundational 
principles of soil biogeochemistry and ecosystem models; in particular, 
the perceived importance of the ‘recalcitrance’ of the input biomass (the 
idea that molecular structure alone can create stable organic matter) and 
of humic substances (biotic or abiotic condensation products). New 
observations show these to be only marginally important for organic 
matter cycling’*'’. Furthermore, loose use of the term ‘recalcitrance’ has 
significantly confused the discussion in the past. 

We need to ensure that the conceptual framework that supports our 
understanding of soil carbon cycling is consistent with observations and 
has a mechanistic basis, as only then can we start to make the necessary 
advances in terrestrial ecology and improve our ability to predict soil
responses to changes in climate, vegetation or management. Here we 
articulate key insights into soil carbon cycling synthesized from research 
of the past decade, and describe the research challenges they pose for the 
coming decade. 


Molecular structure and decomposition 

The initial decomposition rate of plant residues correlates broadly with 
indices of their bulk chemical composition, such as the nitrogen content or 
the fraction of plant residue that cannot be solubilized by strong acid 
treatments (often operationally defined as ‘lignin’)'*. Accordingly, the 
molecular structure of biomass and organic material has long been 
thought to determine long-term decomposition rates in the mineral soil. 
However, using compound-specific isotopic analysis, molecules predicted 
to persist in soils (such as lignins or plant lipids) have been shown to turn 
over more rapidly than the bulk of the organic matter (Fig. 1)'*'*""”. 
Furthermore, other potentially labile compounds, such as sugars, can 
persist not for weeks but for decades. We therefore cannot extrapolate 
the initial stages of litter decomposition to explain the persistence of 
organic compounds in soils for centuries to millennia—other mechanisms 
protect against decomposition. Perhaps certain compounds require co- 
metabolism with another (missing) compound, or microenvironmental 
conditions restrict the access (or activity) of decomposer enzymes (for 
example, hydrophobicity, soil acidity, or sorption to surfaces"). 


Soil humic substances 

The prevalence of humic substances in soil has been assumed for 
decades’’. Previous generations of soil chemists relied on alkali and acid 
extraction methods” and observations of the extracted (or residual)
functional-group chemistry to describe the presence of operationally
defined ‘humic and fulvic acids’ and ‘humin’. Humic substances were 
thought to comprise large, complex macromolecules that were the 
largest and most stable SOM fraction. However, we now understand 
that these components represent only a small fraction of total organic 
matter'**"*; direct, in situ observations, rather than verifying the 
existence of these large, complex molecules, in fact find smaller, simpler 
molecular structures, as visualized in Fig. 2 (refs 13, 22, 23). Some of 
what is extracted as humic acids may be fire-derived**”, although these 
compounds are rare in soil without substantial fire-derived organic 
matter. In any case, there is not enough evidence to support the hypo- 
thesis that the de novo formation of humic polymers is quantitatively 
relevant for humus formation in soils. 


Fire-derived organic matter 

Fire-derived organic matter (also called char, black carbon or pyrolysed 
carbon) is found in many soils, sediments and water bodies, and can 
comprise up to 40% of total SOM in grasslands and boreal forests”®. It is 
not inert, but its decomposition pathways remain a mystery. Fire-derived 
carbon was suspected to be more stable in soil than other organic matter 
because of its fused aromatic ring structures and the old radiocarbon ages 
of fire residues isolated from soil?”. However, fire-derived carbon does 
undergo oxidation and transport, as we now know from archaeological 
settings’*, soils*’°, and from breakdown products in river*' and ocean 
water**’. In a field experiment, fire-derived residues were even observed
to decompose faster than the remaining bulk organic matter, with 25% lost 
over 100 years (ref. 29). Spectroscopic characterization shows that com- 
bustion temperature affects the degree of aromaticity and the size of 
aromatic sheets, which in turn determine short-term mineralization 
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Figure 1 | Molecular structure does not control long-term decomposition of 
soil organic matter (SOM). Certain plant-derived molecules (classically, long- 
chain alkanoic acids, n-alkanes, lignin and other structural tissues) often persist 
longer than others while plant biomass is decaying. In mineral soil, however, 
these relatively persistent components appear to turn over faster than the bulk 
soil (top row), except for fire-derived organic matter (bottom row). Even 
components that appear chemically labile, including proteins and saccharides 
of plant and microbial origin (“Different biological sources’), instead seem to 
turn over (on average) at rates similar to those of bulk SOM, that is, on the order 
of years or even decades. Thus, over time, the importance of initial quality fades 
and the initially fast-cycling compounds are just as likely to persist as the 
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slow'*”’. This figure compiles data from surface horizons of 20 long-term field 
experiments (up to 23 years) in temperate climate, using '*C labelling to trace 
the residence time of bulk SOM and of individual molecular compounds. The 
variation in turnover time is also seen in the compounds of microbial origin 
analysed for '°C content, phospholipid fatty acids (PLFA) produced by Gram- 
negative and Gram-positive bateria and amino sugars (hexosamines). Redrawn 
from ref. 15 (with permission); for clarity, we have excluded outliers, and we 
have added the tentative data on fire-derived organic matter. Data points: thin 
horizontal lines, 10th and 90th percentiles; box, 25th and 75th percentiles; 
central vertical line, median. 
Figure 2 | In soil, the existence of humic substances has not been verified by
direct measurements. a, Based on chemical analysis of the extracted materials 
(Observed), the de novo formation of humic polymers (Interpretation) was 
postulated to be an important source of recalcitrant SOM. b, Direct high- 
resolution in situ observations with non-destructive techniques (Observations) 
have been able to explain the functional group chemistry of the extracted humic 
substances as relatively simple biomolecules (Interpretation), without the need 
to invoke the presence of unexplainable macromolecules'*’. Moreover, the 
chemical mixture of SOM is spatially distinct on a nanometre scale, and the 
aromatic/carboxylate-rich compounds characteristic of the bulk extracted 
humic substances have not been found in situ even when looking at the 
submicrometre scale (using near-edge X-ray fine structure spectroscopy 
combined with scanning transmission X-ray microscopy)”. 


rates****~°°. To reconcile the observations of decomposability with the old
radiocarbon ages of fire-derived carbon deposits’”*, it has been suggested 
that physical protection and interactions with soil minerals play a signifi- 
cant part in black-carbon stability over long periods of time”. 
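
As a back-of-envelope reading of the field result quoted above (25% lost over 100 years), the loss can be converted into a rate constant and a mean residence time. This is only a sketch: it assumes a single pool decaying with first-order kinetics, which the source does not state, and the function name is ours.

```python
import math


def first_order_rate(fraction_lost: float, years: float) -> float:
    """Rate constant k (per year), assuming C(t) = C0 * exp(-k * t)."""
    if not 0.0 < fraction_lost < 1.0:
        raise ValueError("fraction_lost must be between 0 and 1")
    return -math.log(1.0 - fraction_lost) / years


# Assumption: the 25% loss over 100 years follows single-pool first-order decay.
k = first_order_rate(0.25, 100.0)
mean_residence_time = 1.0 / k    # years
half_life = math.log(2.0) / k    # years

print(f"k = {k:.5f} per year")
print(f"mean residence time = {mean_residence_time:.0f} years")
print(f"half-life = {half_life:.0f} years")
```

Under these assumptions the mean residence time is roughly 350 years, far shorter than the old radiocarbon ages reported for fire-derived carbon deposits, which is the discrepancy that physical protection and mineral interactions are invoked to explain.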


Influence of roots 

Root-derived carbon is retained in soils much more efficiently than are 
above-ground inputs of leaves and needles’. Isotopic analyses and 
comparisons of root and shoot biomarkers confirm the dominance of 
root-derived molecular structures in soil’’ and of root-derived carbon in 
soil microorganisms. Preferential retention of root-derived carbon has 
been observed in temperate forests***°, for example, where below-ground 
inputs, including fungal mycelia, make up a bigger fraction of new carbon 
in SOM than do leaf litter inputs***”. In addition to many above-ground 
inputs being mineralized in the litter layer, root and mycorrhizal 
inputs have more opportunity for physico-chemical interactions with 
soil particles*®. At the same time, fresh root inputs may ‘prime’ microbial 
activity, leading to faster decomposition of older organic matter**”” as 
well as changing community composition®’. Carbon allocation by plants 
thus plays an important part in soil carbon dynamics, but it is not known 
how future changes in plant allocation will affect soil carbon stocks*’. 


Physical disconnection 

The soil volume occupied by micro-organisms is considerably less than 
1%: this occupied volume is distributed heterogeneously in small-scale 
habitats, connected by water-saturated or unsaturated pore space'*. The 
availability of spatially and temporally diverse habitats probably gives 
rise to the biodiversity that we see in soil, but this fragmentation of 
habitat may restrict carbon turnover. At present, we are far from being 
able to quantify the complex processes of soil structure development and 
fragmentation, which have different space scales and timescales depending 
on soil type, texture and management’. The physical disconnection 
between decomposer and organic matter is likely to be one reason for 
persistence of deep SOM. The specific pedological processes operating in 
a given soil type that influence the distribution of organisms and substrates, 
such as bioturbation and formation of preferential flow paths, need to be 
taken into account to understand and quantify subsoil carbon dynamics, 
and thus its vulnerability to decomposition”. 
Deep soil carbon

There is a lot more deep soil carbon than we once thought, and the 
underlying processes inhibiting its turnover are still largely unknown. 
Despite their low carbon concentrations, subsoil horizons contribute to 
more than half of the global soil carbon stocks*. In fact, the response of 
deep soils to land-use change can equal that from the top 30 cm of soil, 
even though typically only the shallow depths are explicitly represented 
in models”. Inputs of carbon to the subsoil include dissolved organic 
matter, root products, and transported particulates from the surface”, 
but the relative importance of different sources is not known”®. Based on 
depth trends of elemental composition (decreasing C/N ratio), isotopic 
composition (increasing δ¹³C values) and individual organic com-
pounds, microbial products make up more organic matter in subsoil
horizons than do plant compounds”. 

Organic matter in subsoil horizons is characterized by very long turnover 
times that increase with depth—radiocarbon ages of 1,000 to > 10,000 years 
are common—but the reasons for this are not clear. Microbial activity may 
be reduced by suboptimal environmental conditions, nutrient limitation or 
energy scarcity, and organic matter may be less accessible because of its 
sparse density or association with reactive mineral surfaces. Microbial 
biomass decreases with soil depth**, and community composition changes 
to reflect an increase in substrate specialization”. Recent studies suggest 
that energy limitation, or the converse—‘priming’ (see below) by root 
exudates or dissolved organic carbon—is an important factor in the subsur- 
face**’. Most studies concerning these factors, however, have been con- 
ducted in the laboratory, and their relevance in situ needs evaluation. If we 
do not understand these mechanisms of stabilization, we cannot predict the 
vulnerability of deep SOM to change. 


Thawing permafrost 

Permafrost soils store as much carbon (up to 1,672 × 10¹⁵ g; ref. 60)
as was believed a decade ago to exist in all soils worldwide. During 
permafrost thaw, which is expected to become widespread owing to 
climate change, much of this SOM may be vulnerable to rapid 
mineralization” if it is primarily stabilized by freezing temperatures”. 
There is evidence that old carbon is mobilized following permafrost 
thaw®'®*, which indicates that organic matter previously locked in the 
permafrost is highly vulnerable. Moreover, the accelerated decom- 
position may increase nitrogen availability, which would amplify the 
direct effects of warming on microbial activity. Alleviation of nitrogen 
limitation in tundra experiments led to large and rapid carbon losses, 
including older carbon®*®*. Over the very long term, however, formation 
of pedogenic reactive minerals in former permafrost soils may act to 
stabilize SOM***’, and development of soil structure may lead to phys- 
ical disconnection between organic matter and decomposers. 

Despite some important recent research, surprisingly little is known 
about permafrost biogeochemistry and how the landscape would evolve 
with warming. Key questions surround the extent to which permafrost 
carbon is additionally stabilized by other processes beyond freezing, and 
the extent to which the active layer becomes saturated and anaerobic. The 
extent, rates and spatial variability of these processes are still largely 
unknown for permafrost soils, forming one of the major uncertainties 
in predicting climate-carbon feedbacks. 


Soil micro-organisms 

Soil microbial diversity and activity can be characterized at molecular 
resolution, but the quantitative linkages to ecosystem function are 
uncertain®. Soil micro-organisms influence SOM cycling not only 
via decomposition but also because microbial products are themselves 
important components of SOM”. As a result, environmental change can
influence soil carbon cycling through changes in both metabolic activity 
and community structure. For example, microbial community shifts 
following nitrogen additions can have large effects on decomposition 
rates°°'. New genetic and protein-based tools enable the quantification 
of soil microbiological abundance and functioning (for example, enzym- 
atic gene expression), and can describe the microbial community com- 
position with very high taxonomic resolution’’. Nevertheless, the chal-
lenge remains of synthesizing this immense amount of detailed informa- 
tion” and linking it to the rates and routes of SOM processing. To 
quantitatively relate microbial genomics to ecosystem function, we need 
a better understanding of microbial functional redundancy. 


Implications of these insights 


Taken together, these eight insights paint a broad picture of carbon cycling 
in soil that has implications for fundamental research, land management, 
and climate change prediction and mitigation (Fig. 3). They suggest that 
the molecular structure of plant inputs and organic matter has a secondary 
role in determining carbon residence times over decades to millennia, and 
that carbon stability instead mainly depends on its biotic and abiotic 
environment (it is an ecosystem property). Most soil carbon derives from 
below-ground inputs and is transformed, through oxidation by micro- 
organisms, into the substances found in the soil. By moving on from the 
concept of recalcitrance and making better use of the breadth of relevant 
research, the emerging conceptual model of soil organic carbon cycling 
will help to unravel the mysteries surrounding the fate of plant- and fire- 
derived inputs and how their dynamics vary between sites and soil depths, 
and to understand feedbacks to climate change. We argue that the per- 
sistence of organic matter in soil is largely due to complex interactions 
between organic matter and its environment, such as the interdependence 
of compound chemistry, reactive mineral surfaces, climate, water avail- 
ability, soil acidity, soil redox state and the presence of potential degraders 
in the immediate microenvironment. This does not mean that compound 
chemistry is not important for decomposition rates, just that its influence 
depends on environmental factors. Rather than describing organic matter 
by decay rate, pool, stability or level of ‘recalcitrance’—as if these were 
properties of the compounds themselves—organic matter should be 
described by quantifiable environmental characteristics governing stabil- 
ization, such as solubility, molecular size and functionalization”’. 


Soil response to global environmental change 
We now consider how these insights affect our use of numerical models. 
Such models are powerful tools for quantifying the complex interactions
and feedbacks that will underpin soil responses to global change. A variety 
of models that include SOM dynamics have informed our response to 
environmental issues, including agricultural management, bioremediation 
and environmental water research”. Most model testing, however, has 
been at local-to-regional spatial scales, spanning seasons to decades 
(although the century-long Rothamsted experiments are a noteworthy 
exception). In the long term or at a global scale, mechanisms of SOM 
stabilization and destabilization that are not currently embedded in models 
have the potential to dominate soil carbon dynamics, making it vital that 
models are correct for the right reasons. Recent model intercomparisons 
reveal large differences among predictions of soil carbon stocks and fluxes 
in the next century, for example’, demonstrating how sensitive global 
carbon cycling is to assumptions about SOM decomposition dynamics. 
Recent advances in our mechanistic understanding of soils, such as 
those described above, have not yet been incorporated into the widely 
used models of SOM cycling, which are all structured around the idea 
that a type, or pool, of organic material will have an intrinsic decay 
rate’*’*. These models rely on simple proxies—such as soil texture as 
a surrogate for sorption and other organo-mineral interactions, and 
litter quality (such as lignin:N ratios or structural carbon groupings) 
as a means of partitioning plant inputs into pools of different turnover 
times—but in general these parameters are not consistent with the 
observations that are starting to emerge. Global models largely ignore 
deep mineral soils and are only now beginning to address the accumula- 
tion and loss of carbon in peatlands and permafrost”. Even more impor- 
tantly, parameterizations based on litter chemistry may correlate well to 
initial rates of litter decomposition, but they have little relationship to 
the rates of decomposition for microbial residues or to organic matter 
sorbed to mineral surfaces or isolated in aggregates. Moreover, most 
models that make any allowances for microbial biomass treat it as a 
pool of carbon, rather than as an agent that affects the decomposition 
rate of SOM. The large disagreement among predictions of soil carbon 
fluxes in a warmer world highlights both the complexity of the many 
potential feedbacks to climate and the uncertainty that arises as a result’. 
How does the perspective that SOM persistence is an ecosystem 
property inform our understanding of the response of decomposition 


[Figure 3 schematic. Legible labels: fresh plant litter (leaves); condensation reactions; creation of new stable compounds; enzymes; e-acceptors; physical disconnection; molecular structure determines timescale of persistence.]


Figure 3 | A synopsis of all eight insights, contrasting historical and 
emerging views of soil carbon cycling. The historical view (a) has emphasized 
above-ground plant carbon inputs and organic matter in the top 30 cm of soil. 
Stable organic matter is seen to comprise mainly selectively preserved plant 
inputs and de novo synthesis products like humic substances, whose chemical 
complexity and composition render them nearly inert relative to microbial 
degradation. The emerging understanding (b) is that the molecular structure of 
organic material does not necessarily determine its stability in soil (1; molecular 
structure). Rather, SOM cycling is governed by multiple processes (5) shaped
by environmental conditions (such as physical heterogeneity). Plant roots and 
rhizosphere inputs (4; roots) make a large contribution to SOM, which is 
mainly partial degradation and microbial products and fire residues (3) rather 
than humic substances (2). The vulnerability of deep soil carbon (6; deep 
carbon) to microbial degradation (8; soil micro-organisms) in a changing 
environment, such as thawing permafrost (7; thawing permafrost) remains a 
key uncertainty. 
to warming? The conventional assumption that older SOM is re-
calcitrant implies that this large carbon pool is highly temperature sensi- 
tive, because Arrhenius kinetics tells us that reactions with higher 
activation energies are more temperature-sensitive than those with 
low activation energies*’**. Our ecosystem perspective suggests that 
the mechanisms governing the timing and magnitude of a response to 
a change in temperature are far more complex than this, as further 
physical, chemical and biological mechanisms controlling decomposi- 
tion and stabilization would also be affected*'****. A recent incubation 
study of soils from a wide range of sites found that lower initial decom- 
position rates were associated with higher temperature sensitivity but 
not with any change in SOM quality indices®, suggesting that multiple 
stabilization mechanisms are temperature sensitive. Nevertheless, it is 
not yet possible to predict the integrated response of decomposition to 
changes in climate. In fact, we could use the ability to accurately predict 
temperature response as a guide to the degree of mechanistic repres- 
entation that we need in our next generation of soil carbon models. 
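
The Arrhenius argument above can be made concrete with a short calculation. The sketch below assumes the standard form k = A exp(−Ea / (R T)) and computes the ratio of rates for a 10 °C warming (often called Q10). The two activation energies are hypothetical values chosen only to contrast a low-Ea and a high-Ea reaction; they are not taken from the source.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def q10(activation_energy: float, temp_c: float) -> float:
    """Ratio k(T + 10) / k(T) for Arrhenius kinetics k = A * exp(-Ea / (R * T)).

    activation_energy is in J/mol and temp_c in degrees Celsius; the
    pre-exponential factor A cancels in the ratio.
    """
    t1 = temp_c + 273.15
    t2 = t1 + 10.0
    return math.exp(activation_energy / R * (1.0 / t1 - 1.0 / t2))


# Hypothetical activation energies (J/mol), for illustration only.
for label, ea in [("low Ea", 50_000.0), ("high Ea", 90_000.0)]:
    print(label, round(q10(ea, 15.0), 2))
```

The higher activation energy gives the larger ratio, which is the basis of the conventional inference. The calculation covers only the intrinsic kinetic term; it says nothing about the physical, chemical and biological controls that the ecosystem perspective adds.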


Phyto-engineering 

Phyto-engineering to produce plant tissues high in chemical compounds 
resistant to rapid mineralization, such as plant lipids and lignin, has been 
suggested as a means to increase carbon sequestration®®. This strategy is 
called into question, however, if the molecular structure of plant 
compounds does not determine stability on the timescales necessary 
for significant carbon sequestration'”*’. More generally, sequestration 
strategies based on adding recalcitrant material to soils, whether through 
plant selection for recalcitrant tissues or through biochar amendments, 
must be re-evaluated. Enhancing root carbon input to soils might be a 
more promising avenue, but it is not known what root properties influ- 
ence rhizodeposition rates or stability”, or the extent to which root inputs 
will stimulate (prime) decomposition of other SOM. 


Biochar 

Biochar (intentionally pyrolysed biomass) has gained much attention in 
recent years as a means to increase soil fertility and store carbon in soil for 
decades to centuries**. However, certain types of biochar can degrade 
relatively rapidly in some soils, probably depending on the conditions 
under which they were produced, which suggests that pyrolysis could 
be optimized to generate a more stable biochar. But as with natural fire 
residues, persistence over the long term may also be affected by interac- 
tions with minerals and by soil conditions (for microorganisms capable of 
char oxidation and for abiotic oxidation). Whether interactions of fire- 
derived carbon with soil minerals may be manipulated to enhance 
stability, and what the trade-offs might be with fertility benefits, are not 
known. Biochar is likely to be a useful part of sequestration-mitigation 
strategies, but more understanding of the variation in its decay rates is 
needed before we can develop simple (that is, policy-relevant) quantitative 
relationships between biochar additions and expected sequestration. 


Vulnerability of soil to degradation 

The vulnerability of SOM to degradation will depend on the nature of the 
disturbance as well as the stabilization and destabilization mechanisms at 
play in a given ecosystem. Hence, as with carbon stability, the vulnerability
of soil stocks should not be assessed according to the classes of organic 
matter present, but rather according to the mechanisms through which 
organic matter is stabilized or made assimilable in that soil, and how these 
interacting physical, chemical and biological factors respond to change”®. 
Improved understanding of SOM destabilization is needed to enhance 
efforts to avoid soil degradation and accelerate recovery of degraded soils. 


The way forward 

Soils are now in the ‘front line’ of global environmental change—we 
need to be able to predict how they will respond to changing climate, 
vegetation, erosion and pollution so that we can better understand their 
role in the Earth system and ensure that they continue to provide for 
humanity” and the natural world. The conceptual framework of soil 
carbon cycling presented here, that residence time is a property of the
interactions between organic matter and the surrounding soil eco- 
system, will help us get nearer to these goals’. This will require developed 
and entirely new lines of research and modelling, including: (1) applying 
a new generation of field experiments and analytical tools to study the 
processes driving SOM stabilization and destabilization; (2) developing a 
new generation of soil biogeochemistry models that represent the 
mechanisms driving soil response to global change; and (3) joining forces 
and connecting the disparate research communities that are studying, 
managing and predicting SOM cycling and terrestrial ecology. 


The next generation of experiments 

Although this is not a novel recommendation, we cannot overstate the need for
long-term, manipulative experiments designed to test soil-based hypo- 
theses. In some countries, long-term ecological observational networks 
already exist, but most were designed with vegetation or hydrology goals. 
Many are in danger of being discontinued. Although preserving these 
experiments is crucial, they may not be sufficient to untangle individual 
soil processes. In the near term, new disciplines and techniques could be 
applied to ongoing experiments, allowing the investigation of changes that 
occurred after decades of manipulation”’. Focus is needed on long-term, 
controlled manipulations of entire soil profiles (that is, to a metre or more 
depth) to investigate distinct mechanisms in situ, and on observatories 
allowing quantification of budgets, such as large-scale lysimeters. In addi- 
tion, research approaches are needed that combine manipulations with 
spatial gradients—and thus timescales—for variables and processes of 
interest. These new experiments should be designed to help determine 
the key soil functional traits for understanding and modelling thresholds 
in SOM storage and loss. Such traits, including soil depth, mineral charge 
density and pH, vary spatially, but we suggest that their spatial distribu- 
tions are ultimately predictable according to geologic setting, disturbance 
and management history, climate and ecosystem plant characteristics—in 
other words, the six state factors: climate, organisms, relief, parent material, 
time and human activity”’. One of the major weaknesses of current models 
is the lack of representation of edaphic characteristics (that is, those 
physical and chemical features that are intrinsic to the soil)—and the fact 
that the major stabilization mechanisms will vary spatially with soil type 
and topographic positions. 


Tracing pathways, fluxes and biology 

When combined with manipulative experiments, new analytical tech- 
niques and instrumentation to study elements, isotopes and molecules 
in terrestrial ecosystems offer great potential for revealing the mechan- 
isms underpinning soil carbon stability. Advances in physics, material 
sciences, genomics and computation continue to create new research 
opportunities. Because many, if not most, organic molecules in soils 
are of microbial origin, experiments are needed that identify the long- 
term drivers of microbial-cell and microbial-product decomposition, 
rather than focusing on the immediate fate of fresh plant material. 

We propose extending the systems biology approach to the non- 
living environment that surrounds organisms. Individual molecules could 
then be traced back from the soil into the cell via metabolic pathways and 
specific gene expressions. As with medicine, where structure-function 
relationships led to the development of genomics and proteomics, allow- 
ing illness to be treated before symptoms appear, we foresee that the 
integration of molecular, biological and physical information will provide 
soil science with a more mechanistic basis for predictions. 

However, a major obstacle remains. The molecular complexity of 
SOM is extraordinary, and the metabolic products of higher plants 
and the diverse soil microbial community are mixed together in a 
three-dimensional inorganic soil matrix. An essential step to overcom- 
ing this obstacle is the identification of intact molecular structures in 
soils. In marine and freshwater sciences, ultrahigh-resolution (Fourier- 
transform ion cyclotron resonance) mass spectrometry has made it 
possible to identify tens of thousands of organic compounds in water 
and soil water, a major step towards implementing a systems biology 
approach in soils*’. Ultrahigh-resolution mass spectrometry could be
applied to soils by combining it with ionization techniques such as 
matrix-assisted laser desorption ionization. 

For soils, any high-resolution molecular technique will have to be 
combined with imaging techniques in order to address spatial relation- 
ships and heterogeneity. Moreover, new spatial techniques will help 
uncover how microbial community structure, enzyme production and 
decomposition activity are influenced by environmental conditions and 
plant processes. For example, combining fluorescence in situ hybridiza- 
tion, which produces a spatial map of the microbial community, with 
secondary ion mass spectrometry imaging on a nanometre scale, can be a 
powerful way to link biota with processes at the submicrometre scale”””’. 

Beyond imaging, new methods to trace particle and solute transport 
(for example, viral DNA labels) can help us to understand the processes 
linking deep and surface soils, and isotopic advances reveal both the 
movement and the chemical transformation of carbon in soil. The value 
of isotopically labelled inputs has been greatly amplified by new tools 
that allow precise measurements on small samples: it is now possible to 
follow labelled elements in the environment (for example, ¹⁴C), and to
‘fingerprint’ specific plant compounds and microbial products in soil, 
and therefore to determine how decomposition pathways and substrate 
ages interact. For measuring carbon isotopic values in gas and water 
fluxes, we no longer need to rely on weekly measurements, which carry 
the danger of missing episodic but crucial events. Quantum cascade and 
cavity ring-down laser spectroscopy allow high-frequency observations 
of the molecular and isotopic composition of soil gas, efflux and water. 


Integrative computational databases 

Advanced analytical methods are generating an ever-increasing amount 
of data on SOM. Soil DNA databases are currently being developed 
nationally” and internationally (for example, soil microbial genomics 
libraries). In parallel, data-fusion-type techniques are now being applied 
to the large data sets generated by microbial fingerprinting and spectro- 
metric methods”. However, we are still lacking the integrative databases 
and computational tools that would enable us to identify significant 
relationships between the detailed molecular composition of soil 
organic carbon, expressions of microbial and plant activities, and soil 
environmental conditions. The development of international libraries 
and high-performance computational databases for molecular SOM 
research will be necessary to take advantage of new molecular tools 
and create synthesis across analytical platforms. 


Mechanisms in global ecosystem models 

The current soil models embedded in Earth system models are struc- 
tured around 3-5 pools of organic substrates, with transformation rates 
modified by empirical correlations to soil temperature and water con- 
tent, and with clay content as a proxy for mineral stabilization of organic 
matter’*”*. These models have proven extremely useful, but in the case 
of long-term feedbacks to climate, a more realistic representation of 
what governs organic-matter stability will be needed to more accurately 
inform predictions and policies. Table 1 characterizes what the most 
widely used ecosystem models do, relative to the insights described 
above, and gives our recommendations for how to incorporate recent 
advances into models. Many of the recommendations in Table 1 are 
what we consider ‘low-hanging fruit’: that is, modifications that could be 
made in existing model frameworks, with existing knowledge and data, 
and that should make significant improvements. 
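
To make the structure described above tangible, here is a minimal sketch of a three-pool, first-order model with a constant-Q10 temperature modifier, a moisture scalar and clay content as a stabilization proxy. It is a generic illustration written for this passage, not the code of CENTURY, RothC or any other named model; all pool sizes, rate constants and the clay coefficients are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    carbon: float     # stock, e.g. kg C per m^2
    base_rate: float  # intrinsic first-order decay constant at 10 deg C, per year


def temperature_modifier(temp_c, q10=2.0, ref_c=10.0):
    """Constant-Q10 scaling of decay relative to a reference temperature."""
    return q10 ** ((temp_c - ref_c) / 10.0)


def step(pools, inputs, temp_c, moisture, clay_fraction, dt):
    """Advance all pools by dt years (explicit Euler); return carbon respired.

    moisture is a 0-1 scalar. clay_fraction (0-1) acts as the proxy for
    mineral stabilization: more clay passes a larger share of decayed carbon
    to the next, slower pool instead of releasing it as CO2.
    """
    modifier = temperature_modifier(temp_c) * moisture
    transfer = 0.2 + 0.3 * clay_fraction  # hypothetical coefficients
    respired = 0.0
    carry = inputs * dt                   # fresh input enters the first pool
    for pool in pools:
        decayed = pool.base_rate * modifier * pool.carbon * dt
        pool.carbon += carry - decayed
        carry = transfer * decayed
        respired += decayed - carry
    respired += carry  # no slower pool exists to receive the last transfer
    return respired


pools = [
    Pool("active", 0.2, 2.0),
    Pool("slow", 3.0, 0.05),
    Pool("passive", 5.0, 0.001),
]
total_before = sum(p.carbon for p in pools)
respired = 0.0
for _ in range(100):  # 100 steps of 0.1 year = 10 years
    respired += step(pools, inputs=0.3, temp_c=15.0, moisture=0.8,
                     clay_fraction=0.3, dt=0.1)
total_after = sum(p.carbon for p in pools)
print(f"stock change: {total_after - total_before:.3f}, respired: {respired:.3f}")
```

Note that each pool's fate is fixed by its own `base_rate`: nothing in this structure lets microbial activity, depth or physical protection change how long a given carbon atom persists, which is the limitation the text identifies.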

Specifically, we suggest changes along three ‘axes’. First, decay of 
organic matter is a biological process and should be treated as such in 
models. For example, changes in below-ground carbon allocation or in 
nutrient availability may alter microbial community composition and 
activity, which in turn will alter the rate of degradation and the types 
of organic matter that are degraded. Rhizospheric inputs of easily 
assimilable substrates may aid in, or prime, the decomposition of com- 
pounds that would otherwise be selectively avoided by microorganisms”. 
This breadth of biological influence is not currently accounted for. 
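
One way to treat decay as a biological process is to let the decomposition flux depend on microbial biomass rather than on substrate alone. The sketch below is a deliberately simplified illustration under our own assumptions: a Michaelis-Menten-type flux, microbes that grow on both fresh inputs and decomposed SOM, and no return of dead biomass to SOM. Every parameter value and name is hypothetical.

```python
def simulate(fresh_input, years=5.0, dt=0.01):
    """Return SOM carbon decomposed over the run for a given fresh-input rate.

    Assumptions (illustrative only): the decomposition flux is
    v_max * biomass * som / (k_half + som); microbes grow on fresh input plus
    decomposed SOM with a fixed efficiency and die at a fixed rate; dead
    biomass is not recycled into SOM.
    """
    som, biomass = 10.0, 0.1            # initial stocks
    v_max, k_half = 0.5, 10.0           # maximum specific rate, half-saturation
    efficiency, mortality = 0.3, 1.0    # growth efficiency, death rate per year
    decomposed = 0.0
    for _ in range(round(years / dt)):
        decomposition = v_max * biomass * som / (k_half + som)
        growth = efficiency * (fresh_input + decomposition)
        som -= decomposition * dt
        biomass += (growth - mortality * biomass) * dt
        decomposed += decomposition * dt
    return decomposed


low = simulate(fresh_input=0.1)
high = simulate(fresh_input=1.0)
print(f"SOM decomposed with low input:  {low:.3f}")
print(f"SOM decomposed with high input: {high:.3f}")
```

With the larger fresh input the microbial pool grows and more of the existing SOM is decomposed, a positive effect of input rate on decomposition rate. A pure first-order pool model cannot produce this behaviour, because there the biomass is only a stock and never an agent.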

Second, evidence demonstrating the relatively fast degradation of char 
and lignin implies that all substrates are degradable within a suitable 
environment. Likewise, molecules that chemically resemble one another 
can exhibit very different residence times, depending on whether or not 
they are protected from decay. The way forward for global land models is 
to change their organizing principle from carbon pools with intrinsic 
decomposition rates (based on correlations with texture or litter quality, 
and modified by climate and land-use type) to more mechanistic repre- 
sentations of the stabilization processes that actually govern carbon 
dynamics and therefore the strength of climate feedbacks. 

Third, representing fine-scale processes and heterogeneity at the large 
scale of global models is a major challenge for the field. Box models, 
which have been the basis for modelling soil carbon, assume a mean 
behaviour at a specific spatial scale. Frequently, however, the distribu- 
tions of substrates, microorganisms and environmental conditions are 
highly skewed. For example, a soil that is on average at optimal soil water 
content for microbial activity will contain pockets that are too wet (such 


Table 1 | Representation of soil carbon in ecosystem models and recommendations for potential improvements 


Insight Properties of most published models 


Recommendations 


1. Molecular structure 


temperature as constant Qi9 above 0 °C. 
2. Humic substances 
decomposition and synthesis. 
3. Fire-derived carton 
represent decay of analogous substrates. 
4. Roots 


Decay rate of all pools keyed to substrate (or texture in 
CENTURY-type models*) and modified by moisture and 


Have a cascade of increasing intrinsic recalcitrance due to 


Do not include fire-residues as inputs or SOM. Do not 


Model decay rate as function of substrate properties and positions 
in microenvironment, microbial activity, and soil conditions 
including pH, temperature and moisture. See 4, 5, 6, 8. 

Replace the cascade with cycling of organic matter into and out of 
microbial biomass. See 1, 8. 

Add input pathway for fire-derived carbon. Add aromatic 
compounds to SOM types. 


5. Physical heterogeneity 


6. Soil depth 


7. Permafrost 


8. Soil micro-organisms 


Parameterize litter quality with leaf/needle chemistry. Have 
simplified root and dissolved organic carbon inputs. 

Lack physical processes, such as aggregation (some have 
tillage factor), spatial heterogeneity, or processes that would 
produce priming effectt. 

No change in processes or rate constants with depth of soil 
or carbon input. Site-level tuning required to reproduce long 
turnover times. 

Lack processes governing permafrost soil carbon cycling. 
Lack fully coupled methane biogeochemistry. 


Treat microbial biomass as pool of active carbon. Lack effects 
of microbial community or enzymes on rates and 
decomposition products. 


Use separate characterizations for below-ground and above- 
ground inputs. See 6. 

Non-normal probability distributions, density-dependent terms for 
organic matter and microbial biomass. Parameters from 3D, fine- 
resolution models. 

Representations of mineral associations, root and dissolved organic 
inputs, and physical disconnections. Explicit depth resolution for 
decomposition and transport. 

Add Oz limitation and freezing effects on CO2 and CH, production. 
Develop soil columns to represent inundation, permafrost thaw and 
thermokarst. 

Create and model microbial functional types, analogous to plant 
functional types. Introduce full soil nitrogen cycle coupled to carbon 
cycle. 


Shown are properties of the soil carbon component of published ecosystem models used for global change and carbon cycle analysis, and recommendations for potential improvements, grouped according to the 
eight insights developed in this Perspective (see ‘Recent insights into carbon cycling’ section). Globally implemented land models, such as Orchidee, LPJ, IBIS, CASA’ and CLM, are based on CENTURY, CN, or RothC soil models’*-78.


* Some models use texture (clay content) to determine the amount of carbon in the slowest-cycling pool. However, soils with the same texture differ twofold in carbon stock and turnover time owing to differences in 
mineralogy, for example®?. One improvement would be to replace texture with reactive iron and aluminium or mineral surface charge density, estimated globally from a pedotransfer function. 
† Priming effect means that the carbon input rate has a positive effect on the decomposition rate.
as inside aggregates) as well as too dry (such as inside hydrophobic
microsites). At the landscape scale, a small inundated area exerts a 
disproportionate effect on average methane emissions of a global model 
grid cell. Because decomposition responds nonlinearly to its drivers, 
average environmental conditions are imprecise predictors of the per- 
turbed system, and models that can represent this spatial dimension, 
from the micro-site to the profile to the landscape, are likely to fare 
better than current models. Advances in numerical methods make pos- 
sible more mechanistic treatments of transport, taxa-specific microbial 
requirements and other three-dimensional phenomena”. 
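
Why averaged drivers mislead can be seen in a minimal numerical sketch. It assumes a Q10-type temperature response, a common simplification that is not taken from this Perspective; the function name `q10_rate`, the Q10 value of 2 and the three micro-site temperatures are hypothetical, chosen only for illustration.

```python
def q10_rate(temp_c, base_rate=1.0, q10=2.0, ref_temp_c=10.0):
    """Relative decomposition rate under an assumed Q10 temperature response."""
    return base_rate * q10 ** ((temp_c - ref_temp_c) / 10.0)


# Hypothetical micro-site temperatures (degrees C) within one model grid cell.
site_temps = [0.0, 10.0, 20.0]

# Rate predicted from the average condition of the grid cell.
mean_temp = sum(site_temps) / len(site_temps)
rate_at_mean_temp = q10_rate(mean_temp)

# Average of the rates that each micro-site would actually experience.
mean_of_site_rates = sum(q10_rate(t) for t in site_temps) / len(site_temps)

print(rate_at_mean_temp, mean_of_site_rates)
```

Under these assumptions the rate evaluated at the mean temperature is lower than the mean of the micro-site rates, so a model driven by grid-cell averages would underestimate decomposition; with a saturating response the bias would run the other way.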

Moving forward requires identifying which parameters are critical, 
developing practical representations of these processes and parameters, 
and testing predictions of SOM dynamics against observations at the 
relevant scales (‘ground-truthing’). One reason that models have not 
incorporated recent scientific developments is the lack of appropriate 
data for parameterization and testing. The advanced analytical tech- 
niques and long-term experiments called for above are vital to fill this 
gap. In addition, more effort is needed both by modellers and empirical 
scientists to facilitate model evaluation. For example, the ¹⁴C content of
respired CO2 or leached dissolved organic carbon is a powerful constraint
on underlying mechanisms of changes in stocks. In addition, ¹⁴C
‘clocks’ the time carbon has spent in the ecosystem, and is the only way
to quantify carbon residence time in undisturbed systems. Yet most
ecosystem models lack a ¹⁴C tracer that would allow it to be used for
testing. Clearly, a comprehensive database of ¹⁴C measurements is
needed. 
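
How ¹⁴C can act as a clock is sketched below for the simplest possible case. The sketch assumes a single well-mixed carbon pool at steady state with a constant atmospheric ¹⁴C content, so it ignores bomb-derived ¹⁴C and multiple pools; the function names are hypothetical and the approach is an illustration, not the method of any model discussed here. Under these assumptions the ratio F of pool to atmospheric ¹⁴C content equals k/(k + λ), where k is the decomposition rate constant and λ the radioactive decay constant.

```python
import math

HALF_LIFE_14C_YEARS = 5730.0  # physical half-life of radiocarbon
DECAY_CONSTANT = math.log(2) / HALF_LIFE_14C_YEARS  # lambda, per year


def fraction_modern_at_steady_state(turnover_years):
    """14C content of a one-pool system relative to the atmosphere (assumed model)."""
    k = 1.0 / turnover_years
    return k / (k + DECAY_CONSTANT)


def turnover_time_years(fraction_modern):
    """Invert F = k / (k + lambda) to recover the turnover time 1/k."""
    if not 0.0 < fraction_modern < 1.0:
        raise ValueError("fraction_modern must lie strictly between 0 and 1")
    return (1.0 - fraction_modern) / (DECAY_CONSTANT * fraction_modern)


# Round trip for a hypothetical pool that turns over every 1,000 years.
f_pool = fraction_modern_at_steady_state(1000.0)
print(f_pool, turnover_time_years(f_pool))
```

A model that carries such a tracer can be compared directly with measured ¹⁴C values; real applications would need the time-varying atmospheric record and more than one pool.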


Join forces and connect research communities 

The cycling of organic matter is the subject of many different 
disciplines—from marine chemistry**”’ to low-temperature geology to 
archaeology’. Even within SOM research, there have been at least two 
separate and disconnected directions of research”*. There are those 
studying litter decomposition, with a focus on the biotic breakdown of 
plant inputs, often in forest organic litter layers or agricultural systems. 
On the other hand, there are those focused on organo-mineral interac- 
tions in the mineral soil’*. Other examples of separated research 
approaches include agronomic versus ecological questions, aquatic 
versus terrestrial environments, and laboratory versus field-based 
experiments. Cross-fertilization is especially needed between empirical 
scientists and modellers in the context of global change. Insights into 
mechanisms and observational data will improve predictions, and, in 
return, the needs of models will motivate useful experiments. 

More generally, though, the major advances in our understanding of 
soils will come from research grounded in the theory of many disciplines 
and in the practice of many approaches. The future research agenda for 
soils will integrate many different fields and have broader goals than it 
might have had in the past, with longer time horizons, wider spatial 
coverage, and an imperative to connect carbon, water and nutrient cycles, 
so as to understand the soil-plant system as a crucial part of our biosphere. 
Common diseases are often complex because they are genetically heterogeneous, with many different genetic defects
giving rise to clinically indistinguishable phenotypes. This has been amply documented for early-onset cognitive 
impairment, or intellectual disability, one of the most complex disorders known and a very important health care 
problem worldwide. More than 90 different gene defects have been identified for X-chromosome-linked intellectual 
disability alone, but research into the more frequent autosomal forms of intellectual disability is still in its infancy. To 
expedite the molecular elucidation of autosomal-recessive intellectual disability, we have now performed homozygosity 
mapping, exon enrichment and next-generation sequencing in 136 consanguineous families with autosomal-recessive 
intellectual disability from Iran and elsewhere. This study, the largest published so far, has revealed additional mutations 
in 23 genes previously implicated in intellectual disability or related neurological disorders, as well as single, probably 
disease-causing variants in 50 novel candidate genes. Proteins encoded by several of these genes interact directly with 
products of known intellectual disability genes, and many are involved in fundamental cellular processes such as 
transcription and translation, cell-cycle control, energy metabolism and fatty-acid synthesis, which seem to be
pivotal for normal brain development and function.


Early-onset cognitive impairment, or intellectual disability, is an 
unresolved health care problem and an enormous socio-economic 
burden. Most severe forms of intellectual disability are due to 
chromosomal abnormalities or defects in specific genes. For many 
years, research into the genetic causes of intellectual disability and 
related disorders has focused on X-chromosome-linked intellectual 
disability (XLID). It has become clear, however, that X-linked forms 
account for only 10% of intellectual disability cases, which means that 
the vast majority of the underlying genetic defects must be autoso- 
mal’. For severe forms of intellectual disability, autosomal-dominant 
inheritance is rare because most affected individuals do not repro- 
duce, but recent observations suggest that in outbred Caucasian popu- 
lations, a significant portion of the sporadic cases may be due to 
dominant de novo mutations”. So far, relatively little is known about 
the role of autosomal recessive intellectual disability (ARID), because 
in Western societies, where most of the research takes place, its investi- 
gation has been hampered by infrequent parental consanguinity and 
small family sizes. 

In most Northern African countries, and also in the Near and 
Middle East, parental consanguinity and large families are common; 
for example, in Iran, 40% of the families are consanguineous and 
about two-thirds of the population is 30 years of age or younger. 


Since 2004, we have performed systematic array-based consanguinity 
mapping in 272 consanguineous Iranian families. In several dozen 
families, we have defined single linkage intervals and mapped the 
underlying gene defects*°, and by subsequent mutation screening of 
candidate genes from these intervals, we and others identified several 
novel ARID genes (for review see refs 1, 7). 

Recently, exome enrichment and next-generation sequencing have 
been introduced as a cost-effective and fast strategy for comprehensive 
mutation screening and disease-gene identification in the coding por- 
tion of the human genome*”®. To unravel the molecular basis of ARID 
in a systematic fashion, we have now used a related, but more targeted, 
approach. Instead of sequencing entire exomes in consanguineous 
families, we have focused on the exons from homozygous linkage 
intervals known to carry the genetic defect. Before sequencing, these 
exons were enriched by hybrid capture using custom-made oligonu- 
cleotide arrays as baits. All patients had cognitive impairment (mostly 
moderate or severe, see Supplementary Table 1), and in a subset of the 
families there were signs of autism spectrum disorder. More informa- 
tion about the families and their clinical features, quality controls 
performed to validate the sequence variants observed and to assess 
their pathogenicity, as well as other methodological details are pro- 
vided in Supplementary Information. 
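
The targeted strategy described above, restricting attention to homozygous changes inside the mapped linkage intervals, can be sketched as a simple filter. This is an illustrative sketch only, not the pipeline used in the study: the `Interval` and `Variant` types, the function names, the coordinates and the gene labels are all hypothetical.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


class Variant(NamedTuple):
    chrom: str
    pos: int
    gene: str
    homozygous: bool


def in_any_interval(variant: Variant, intervals: List[Interval]) -> bool:
    """True if the variant lies inside one of the homozygous linkage intervals."""
    return any(
        iv.chrom == variant.chrom and iv.start <= variant.pos <= iv.end
        for iv in intervals
    )


def candidate_variants(variants: List[Variant], intervals: List[Interval]) -> List[Variant]:
    """Keep only homozygous variants that fall within the mapped intervals."""
    return [v for v in variants if v.homozygous and in_any_interval(v, intervals)]


# Hypothetical family: one mapped interval and four observed variants.
intervals = [Interval("chr1", 1_000_000, 5_000_000)]
variants = [
    Variant("chr1", 2_500_000, "GENE_A", True),   # inside interval, homozygous
    Variant("chr1", 7_000_000, "GENE_B", True),   # outside interval
    Variant("chr1", 3_000_000, "GENE_C", False),  # inside interval, heterozygous
    Variant("chr2", 2_500_000, "GENE_D", True),   # different chromosome
]
candidates = candidate_variants(variants, intervals)
print([v.gene for v in candidates])
```

In practice the surviving candidates would still have to be filtered against known neutral variants and checked for co-segregation with the phenotype.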
Mutations in known and novel intellectual disability genes


In 115 out of 136 families studied, plausible causal defects were 
observed, and in 78 of these, a single, apparently disease-causing muta- 
tion could be identified (see Supplementary Fig. 1, Tables 1 and 2 and 
Supplementary Table 2). Twenty-eight protein-truncating changes 
were found, including frameshift, splice-site and nonsense mutations, 
as well as whole-exon deletions, plus several smaller in-frame deletions 
of varying size. In 26 families listed in Table 1, we identified known, 
mostly syndromic forms of ARID, including rare metabolic defects and 
storage disorders, such as an atypical form of Tay-Sachs’ disease and 
Sanfilippo’s syndrome (mucopolysaccharidosis IIIB), as well as
intellectual disability with congenital abnormalities, such as a
Joubert-like syndrome resulting from AHI1 mutations, observed in two
unrelated families. Two families were also found with allelic PRKCG 
mutations, implicated previously in spinocerebellar ataxia, and two 
families carried different allelic mutations in the SRD5A3 gene, assoc- 
iated with Kahrizi’s syndrome, a recently elucidated congenital glyco- 
sylation disorder’*””. 

Two mutations involving the adaptor protein complex 4 were 
observed, namely in the AP4M1 and AP4E1 genes, which encode
different AP-4 subunits. AP-4 is involved in the recognition and 
sorting of cargo protein transported from the trans-Golgi network 
to the endosomal-lysosomal system. Another possibly pathogenic 
change was found in the AP4B1 gene, but its effect may be obscured 
by a PEX6 mutation in the same family, which causes a severe peroxi- 
some biosynthesis disorder!’ and probably accounts for most of the 
clinical features. In highly inbred families, coexistence of two different 
recessive defects is not unexpected and is the most plausible explana- 
tion for the complex phenotypes in at least two families with novel 
forms of ARID (M154 and M189, see Table 2). 

Mutations in the SLC2A1 gene, which encodes a glucose trans- 
porter, the PRKRA gene with a role in dysautonomia, and the 
MED13L gene, previously associated with intellectual disability and 
cardiac symptoms, were the only plausible causes of intellectual dis- 
ability in three families with non-syndromic intellectual disability. 
None of the respective families showed signs of dysautonomia or 
cardiac abnormalities. In all other families, the phenotype was char- 
acteristic for the molecular defect, including family M198 with folate 
receptor deficiency, a rare syndromic form of ARID that can often be
treated by oral administration of folinic acid'*. Further details are
provided in Table 1. 

Apparently pathogenic changes were also found in 50 genes that 
had not been previously implicated in ARID (see Table 2). Thirty of 
the relevant families had non-syndromic forms of intellectual disabil- 
ity, whereas 22 exhibited syndromic forms. Only two of the novel 
ARID genes were mutated in more than a single family. Two different 
missense mutations with high pathogenicity scores were detected in 
ZNF526, which encodes a Krüppel-type zinc-finger protein. One of
these changes was observed in DNA samples collected from two 
distinct families with non-syndromic intellectual disability, but closer 
inspection revealed that these families, which live in the same city in 
the northwestern part of Iran, share a common haplotype and thus 
must be distantly related. In these families, no other potentially 
disease-causing and co-segregating change could be identified. Zinc- 
finger proteins are transcriptional regulators, and other Krüppel-type
zinc-finger genes have been implicated in intellectual disability 
before’. Recent protein interaction studies have indicated a role for 
ZNF526 in promoting messenger RNA translation and cell growth 
(N. Hubner et al., personal communication). Another gene within 
which disease-causing mutations were found in two families was 
ELP2. It encodes a subunit of the RNA polymerase II elongator com- 
plex, which is a histone acetyltransferase component of RNA poly- 
merase II. This gene is involved in the acetylation of histones H3 and 
probably H4, and it may have a role in chromatin remodelling. 


Mutations affecting housekeeping genes 


In the LARP7 gene, we found a frameshift mutation in a family with 
intellectual disability and microcephaly. LARP7 is a negative tran- 
scriptional regulator of polymerase II genes, acting by means of the 
7SK RNP system. Within the 7SK RNP complex, the positive tran- 
scription elongation factor b (P-TEFb) is sequestered in an inactive 
form, preventing RNA polymerase II phosphorylation and sub- 
sequent transcriptional elongation. Hitherto, no disease association 
has been reported for LARP7. 

Presumably causative homozygous mutations were also found in 
KDM5A and KDM6B. These genes encode histone demethylases that
specifically demethylate histone H3 at lysine 4 and lysine 27, respectively,
and they both have a central role in the histone code. We have


Table 1 | Mutations identified in known genes for intellectual disability or related disorders 


Family Gene Mutation LOD score Length (Mb) OMIM no. Diagnosis, clinical features 
8500306 AHI1 R329X 2.65 10.35 608629 Joubert’s syndrome 3 
M332 AHI1 R495H 3.2 1.1 608629 Joubert’s syndrome 3 
M254 AP4E1 V454fs 2.5 357 607244 Microcephaly, paraplegia 
M004 AP4M1 E193K 1.9 6.75 602296 Microcephaly, paraplegia
M324 BBS7 533del2aa 3.24 8.2 209900 Bardet-Biedl’s syndrome
M107 CA8 R237Q 2.4 4.02 613227 Ataxia, cerebellar hypoplasia
M175 COL18A1 L1587fs 2. 9.8 267750 Knobloch’s syndrome (eye and brain development)
G026 FAM126A Splice site* 24 5.46 610532 Hypomyelination-cataract
M198 FOLR1 Splice site* 2, 6.95 136430 Folate receptor deficiency
M165 HEXA C58Y 27 5.91 272800 Psychomotor delay, mild Tay-Sachs’ disease
8600276† L2HGDH R335X 5. 3.39 609584 Hydroxyglutaric aciduria
M142 MED13L R1416H 1.9 9.17 608808 Non-syndromic ID, no cardiac involvement
8600486 NAGLU R565Q 28 3.25 252920 Sanfilippo’s syndrome, MPS IIIB
8500234 PDHX R15H 3.13 35.17 245349 Pyruvate dehydrogenase defect
331 PEX6 L534P 3.8 0.83 601498 Peroxisome biogenesis disorder
8307998 PMM2 Y106F 2.67 6.71 212065 Glycosylation disorder CDG Ia
8600273 PRKCG V177fs 253 0.72 605361 Spinocerebellar ataxia 14
146 PRKCG D480Y 2A 745 605361 Spinocerebellar ataxia 14
8600162 PRKRA S235T 2.1 40.02 612067 Non-syndromic ID
8600042 SLC2A1 V237M 3.73 16.7 606777 Non-syndromic ID
8700017 SRD5A3 Y169C 48 10.5 612713 Kahrizi’s syndrome, CDG
069† SRD5A3 A68fs 3.01 10.44 612713 Kahrizi’s syndrome, CDG
G008 SURF1 W227R 18 4.59 185620 Leigh’s syndrome, very mild form
8600041 TH R202H 21 7.23 605407 Infantile parkinsonism, Segawa’s syndrome
017 VRK1 R133C 3.4 3 607596 Pontocerebellar hypoplasia
196 WDR62 G705G 2.1 18.33 600176 Microcephaly, cerebellar atrophy
CDG, congenital disorder of glycosylation; fs, frameshift; ID, intellectual disability; LOD, logarithm of the odds; MPS, mucopolysaccharidosis; OMIM, Online Mendelian Inheritance in Man.
* See Supplementary Information for further details.
† Remotely related, degree of consanguinity is not clear, analysis performed under conservative assumption of second degree consanguinity.
previously shown that mutations in another lysine-specific histone
demethylase, KDM5C (also called JARID1C), are a relatively frequent 
cause of X-linked intellectual disability’®. In two other families, we 
observed apparently pathogenic mutations that involved histones 
directly: a frameshift mutation in the HIST1H4B gene which belongs 
to the histone 4 family, and a HIST3H3 missense mutation with high 
pathogenicity scores that was the only plausible change in a family 
with non-syndromic intellectual disability. Together, at least ten of the 
novel candidate genes for ARID involve histone structure, histone 
modification, chromatin remodelling or the regulation of transcrip- 
tion, and many of these genes are functionally linked to known and 
novel intellectual disability genes, as shown in Fig. 1a.

Several other mutated genes are directly or indirectly involved in 
the regulation of translation. A homozygous frameshift mutation 
inactivating the TRMT1 gene was detected in a family with non- 
syndromic intellectual disability. TRMT1 is an RNA methyltrans- 
ferase that dimethylates a single guanine residue at position 26 of most 
tRNAs. Previously we and others have shown that inactivation of the 
X-linked gene FTSJ1, another RNA methyltransferase, also gives rise to 
non-syndromic intellectual disability'”"*, and we have recently iden- 
tified several ARID families with truncating mutations in a third RNA 
methyltransferase (L.A.M. et al., manuscript in preparation). A large 
deletion in the EEF1B2 gene was the only detectable defect in another 
family with non-syndromic intellectual disability. EEF1B2 encodes the 
elongation factor 1, which is involved in the transport of aminoacyl- 
tRNAs to the ribosomes. In yet another family with non-syndromic 
intellectual disability, a missense change was found in ADRA2B. This 
gene encodes a brain-expressed G-protein-coupled receptor that 
associates with EIF2B, a guanine exchange factor regulating trans- 
lation”; notably, ADRA2B also interacts with the 14-3-3 protein, 
which in turn associates with RGS7, another novel ARID gene product 
that regulates G-protein signalling. Finally, in a family with a syndromic 
form of intellectual disability, a missense change was found in the 
POLR3B gene, involving a nucleotide with a very high conservation score 
and predicted to be pathogenic by Mutation Taster”. POLR3B encodes 
the second-largest core component of RNA polymerase III, which 
synthesizes small RNAs such as tRNAs and 5S rRNAs” and also inter- 
acts with ENTPD1, the product of a novel candidate gene for intellectual 
disability (see GeneCards, http://www.genecards.org/cgi-bin/cardsearch.pl?search=POLR3B
and Table 2). Together, these observations
indicate that gene defects interfering with transcription and translation 
are particularly important causes of intellectual disability. 
However, we also found pathogenic mutations affecting other fundamental
cellular functions and pathways such as cell-cycle control, as
illustrated by a mutation inactivating CCNA2, and another one truncat- 
ing SCAPER, a specific regulator of the CCNA2-CDK2 complex (see 
Fig. 1b). The C11orf46 gene encodes TTI2, a subunit of the Triple T
complex, which is required for the establishment of cell-cycle checkpoints 
and for DNA-damage signalling”. Other mutations involved fatty-acid 
synthesis and turnover (ACBD6, FASN and PECR; see Table 2), protein 
degradation (UBR7), splicing (ZCCHC8) and cell migration (LAMA1). 


Intellectual disability genes with brain-specific functions 


Not surprisingly, several mutations involved genes with neuron- or 
brain-specific functions. For example, we found a frameshift mutation 
abolishing the function of CACNA1G, a T-type calcium channel with a 
critical role in the generation of GABAB receptor-mediated spike and
wave discharges in the thalamocortical pathway~***. A nonsense muta- 
tion inactivated ZBTB40, which has a role in glia cell differentiation”’, 
and other observed changes are expected to interfere with the regu- 
lation of neurotransmission, exocytosis or neurotransmitter release. 
Our study also adds several novel intellectual-disability-associated 
genes to the Ras and Rho pathway (see Fig. 1c); for example, a convin- 
cing missense mutation in the RALGDS gene was the only variant 
detected in one family with non-syndromic intellectual disability. 
This gene encodes an effector of the Ras-related GTPase Ral, which 
stimulates the dissociation of GDP from the Ras-related RalA and RalB 
GTPases, thereby allowing GTP binding and activation of the 
GTPases”’. Regulators of small GTPases were among the first genes 
to be implicated in non-syndromic intellectual disability’”’*. We also 
found a homozygous frameshift mutation in CNKSR1, which is phys- 
ically associated with RALGDS. Homozygous carriers of this mutation 
have a severe syndromic phenotype with quadrupedal gait. CNKSR1 
binds to rhophilin (Online Mendelian Inheritance in Man (OMIM) 
601031), a Rho effector, suggesting that it acts as a scaffold protein and 
mediates crosstalk between the Ras and Rho GTPase signalling path- 
ways’. Neither RALGDS nor CNKSR1 had been implicated in intel- 
lectual disability so far; thus, both are novel ARID genes. 


Genes without obvious link to intellectual disability 

For several of the sequence variants, there is no obvious functional 
link between the molecular defect and intellectual disability. This 
applies to LINS1 and NDST1, and it is not easy to understand why 
Figure 1 | Known and novel intellectual disability genes form protein and
regulatory networks. a, Transcriptional/translational network. b, Cell-cycle-related
network. c, Ras/Rho/PSD95 network. Connecting edges in the figure
stand for protein-protein interactions. Arrows define direction of
post-translational protein modifications: a, acetylation; ar, ADP-ribosylation; d,
demethylation; da, deacetylation; dq, deubiquitination; m, methylation. Dotted
lines indicate modulation of gene function. Data were obtained in part by using 
the INGENUITY software package (http://www.ingenuity.com) and by 
literature mining. More details about these proteins and their interactions are 
provided in Table 2 and in Supplementary Information. 
Table 2 | Apparently causative variants in novel (candidate) genes for intellectual disability


Family Phenotype Gene Mutation LOD score Length (Mb) Supporting evidence 

M008+ S ACBD6 G22fs 2.65 6.46 P; binds long-chain acyl-CoA molecules, role in fatty acid synthesis or turnover**.

M173 NS,ASD ADK H324R 5.1 9.68 S, P; only change in family. Adenosine kinase, regulates adenosine levels in the brain. 
Overexpression leads to learning impairment in mice*°; knockout mice develop lethal 
neonatal liver steatosis®°. In human, a different gene defect has been found in this 
condition. 

M266-2 NS ADRA2B R440G 2.53 24.97 S, P; GPCR regulating adrenergic neurons in the CNS. Associates with EIF2B, a GEF 
regulating translation!’. Also associates with 14-3-3, which interacts with RGS7, mutated 
in family 8700136. 

M226 NS ASCC3 S1564P 3.2 62.80 S, P, E; helicase that is part of the activating signal co-integrator complex, enhances NF-κB
and AP1. Interacts with RARS2, implicated in pontocerebellar hypoplasia 6°°. 

MOO7Lt NS ASCL1 A41S 24 18.13 Encodes the bHLH factor MASH1, critical role in neuronal commitment and 
differentiation*”°. 

M182 NS C11orf46 R236H 2.1 12.39 P, E; encodes subunit of the Triple T complex, role in regulation of DNA damage response??.

G001 NS C12orf57 M1V 2.5 L119 S; function hitherto unknown. May overlap neighbouring ATN1 (DRPLA) gene (see UCSC
Genome Browser, hg18; OMIM 125370). 

M100 NS C8orf41 P367L 3.3 6.44 S, P, E; C8orf41 associates with RUVBL2*°, which is involved in regulation of transcription 
and interacts with HDACs°°. 

G015 NS C9orf86 A562P 3.3 217 P; encodes Rab-like GTP-binding protein PARF, which interacts with ARF (or CDKN2A).
Other Rab has been implicated in ID*°.

8500031 S CACNA1G S1346fs 27 18.76 P, E; encodes a low-voltage-activated calcium channel which may also modulate the firing
patterns of neurons?34, 

8600057 S CAPN10 138ins5aa 21 2.09 E; calcium-regulated non-lysosomal endopeptidase with a role in cytoskeletal remodelling 
and signal transduction, involved in long-term potentiation®?. 

8600495 NS CASP2 Q392X 2.5 29.62 P; caspase 2, role in apoptosis, abnormal in CASP2-deficient mice, particularly for motor 
and sympathetic neurons°*. Motor abnormalities not observed in family. 

M346 NS CCNA2 Splice site* 3.3 52.17 S, P; cyclin A2 is essential for cell cycle control°?. In mice, targeted deletion of this gene is 
lethal®*. Regulated by SCAPER, mutated in family 8600277. 

8500235t S CNKSR1 T282fs 2.53 5.83 P; regulates Raf in the MAPK pathway, acts as scaffold protein linking Ras and Rho signal
transduction pathways?°. Interacts with RALGDS, which is mutated in family 8500155.

M144 NS COQ5 G118s 1.8 5.10 P, E; methyltransferase with pivotal role in coenzyme Q biosynthesis. Interacts with NAB2
which controls length of poly(A) tail (see http://thebiogrid.org/35094/summary/ 
saccharomyces-cerevisiae/coq5.html). The human orthologue of NAB2 is implicated in 
ARID*. 

M178 NS EEF1B2 Splice site* 2.6 3.84 S,P,E; controls translation by transferring aminoacyl-tRNAs to the ribosome. Interacts with 
UNC51-like kinase 2 which is involved in axonal elongation translation®°°. 

G017 NS ELP2 T555P 24 4.33 P, E; encodes subunit of the RNA polymerase II elongator complex°®. ELP3 subunit
implicated in motor neuron degeneration. Allelic ELP2 mutation found in family 8500061.

8500061 NS ELP2 R462L 2/7 6.98 P, E; involved in transcriptional elongation, see also family G017 with allelic ELP2 mutation.

M263 NS ENTPD1 Y65C 2.65 212 P, E; ectonucleoside triphosphate diphosphohydrolase, expressed in CNS; knockout mice 
display abnormal synaptic transmitter release°”. 

MO50+ Ss ERLIN2 R36K 3.73 2.72 S, P,E; involved in the ER-associated degradation of inositol 1,4,5-triphosphate 
receptors°®. 

8500058 NS FASN R1819W 3.3 4.50 P; gene product synthesizes long-chain fatty acids from acetyl-CoA and malonyl-CoA. 
Expressed in post-synaptic density. In mice, FASN deficiency leads to embryonic 
lethality°?. 

M269 S FRY R1197X 2.8 12.68 P; regulates actin cytoskeleton, limits dendritic branching. In HeLa cells, FRY binds to 
microtubules and localizes on the spindle and is crucial for the alignment of mitotic 
chromosomes®. 

M251 S GON4L Splice site* 3.01 40.19 P, E; cloned from brain. Encodes a transcription factor thought to function in cell cycle
control?*. 

M189¢ S HIST1H4B K9fs 2.1 48.87 P, E; encodes a member of the histone H4 family; analogy to histone H3 mutation in family 
G002. Ehlers-Danlos-related symptoms are probably due to TNXB mutation.

G002 NS HIST3H3 R130C 2.53 26.74 P; role in spindle assembly and chromosome bi-orientation®*°*. See also family M189 
with HIST1H4B mutation. 

8500064 NS INPP4A D915fs 24 46.16 P, E; encodes inositol polyphosphate-4-phosphatase, only plausible change in family. 
Regulates localization of synaptic NMDA receptors, protects neurons from excitotoxic cell 
death®°. Knockout mice develop locomotor instability; not observed in this family. 

M061 S KDM5A R719G 23 6.06 P, E; encodes histone demethylase specific for Lys 4 of histone H3, role in transcriptional
regulation®®. Other histone demethylase has been implicated in X-linked ID'®. See also
family M8303971 with KDM6B mutation.


in humans, adenosine kinase deficiency should lead to intellectual 
disability, whereas in the mouse, overexpression of Adk causes neuro- 
logical symptoms, and Adk deficiency gives rise to early lethal liver 
steatosis*®. Nothing is known yet about the function of the C12orf57 
gene, apart from its apparent overlap with ATN1 (see UCSC Genome
Browser, NCBI36/hg18). CAG trinucleotide expansion in the ATN1 
gene is the cause of dentatorubral pallidoluysian atrophy (DRPLA), 
another syndromic form of intellectual disability. A comprehensive 
list of families with single, probably disease-causing mutations is 
shown in Table 2. 

Despite exhaustive validation of our data and stringent filtering 
against all known neutral and pathogenic sequence variants (see 
Supplementary Information and Supplementary Tables 3-6), it is still
possible that not all of these changes will turn out to be causative. 
Particularly for the numerous missense mutations observed, func- 
tional studies will be required to rule out rare polymorphisms that 
are unrelated to intellectual disability. In a previous study, 1% of the 
protein-truncating mutations on the X chromosome were found to be 
unrelated to disease*’, and in our study, 12 observed inactivating 
mutations did not co-segregate with intellectual disability (see 
Supplementary Table 4). However, we believe that the vast majority 
of the changes presented here as probably pathogenic will be con- 
firmed, even if they have been observed only once, because most of the 
proteins encoded by these novel candidate genes interact with the 
Family Phenotype Gene Mutation LOD score Length (Mb) Supporting evidence

8303971 S KDM6B P888S 3.1 5.08 S, P; demethylase 6B specifically targeting Lys 27 of histone H3, has a central role in
regulation of posterior development by regulating HOX gene expression®’. Mutation of
KDM5A gives rise to ID (see family M061).

M154 s KIF7 E758K 21 7A6 P, E; knockout mouse model with complex picture involving brain and other neurological 
abnormalities. Stickler-like clinical features in this family can be explained by co-existing 
COL9A1 mutation. 

M183 S LAMA1 G1572fs 2.4 5.82 S, P; codes for subunit of laminin, role in attachment, migration and organization of cells 
during embryonic development. Required for normal retinal development in mice®?. 

G030 s LARP7 K276fs 1.93 8.94 S,P; encodes negative transcriptional regulator of polymerase II genes”°. 

7903104 S$ LINS1 H329fs 2.65 787 S, P; similar to lin, a Drosophila gene having important roles in the development of the 
epidermis and the hindgut. Link with ID unclear. 

8600060} NS MAN1B1 R334C 3.13 2.49 P, E; encodes mannosidase that targets misfolded glycoproteins for degradation. MAN1B1 
frameshift mutation observed in another ARID family by Canadian group (J. Vincent, 
personal communication). 

8600277 s NDST1 R709Q 24 10.18 S, P; only change in family. Encodes heparan N-deacetylase/N-sulphotransferase, 
deficiency is lethal in mice due to respiratory distress’!. No obvious link with ID. 

M158 S PARP1 L293F 1.8 16.76 P; poly(ADP-ribose) polymerase involved in histone 1 modification; role in memory 
stabilization in mice’. 

M194 S,ASD  PECR L57V 2.5 11.27 P; brain-expressed peroxisomal trans-2-enoyl-CoA reductase involved in the biosynthesis 
of unsaturated fatty acids’°. 

8401214 S$ POLR3B TI99K 1.93 24.89 E; second-largest core component of RNA polymerase III, which synthesizes small RNAs 
such as tRNAs and 5S rRNAs??. 

8500302 > PRMT10 G189R 2.65 9.75 P, E; protein arginine methyltransferase 10. Protein arginine methylation affects 
chromatin remodelling leading to transcriptional regulation, RNA processing, DNA repair 
and cell signalling”*. 

M010 s PRRT2 A214fs 52 25.59 P; interacts with SNAP25 which in turn assembles with syntaxin-1 and synaptobrevin to 
form exocytotic fusion complex in neurons°°. 

8500155 S RALGDS A706V 4.0 5.56 S, E; effector of Ras-related RalA and RalB GTPases, role in synaptic plasticity*®. Interacts 
with CNKSR1, inactivated in family 8500235. 

8700136 S,ASD  RGS7 N304fs 2,53 24.34 P; regulator of G protein signalling. Interacts with 14-3-3 protein, tau and snapin, a 
component of the SNARE complex required for synaptic vesicle docking and fusion’®. 
Indirectly linked with ADRA2B, mutated in family M266_2. 

8600086 Ss SCAPER Y118fs 3.9 17.45 S, E; interacts with CCNA2/CDK2 complex, transiently maintains CCNA2 in cytoplasm7°. 
CCNA2 is mutated in family M346. 

8600012 S SLC31A1 R90G 2 13.85 P, E; encodes one of two genes involved in copper import. Deficiency of the SLC31A1 
orthologue in mice is early lethal, heterozygotes have progressive neurological disorder’ ’, 
similar to patients in this family. 

i? s TAF2 W649R 2.1 19.16 P, E; TATA-box-associated gene is very important regulator of transcription (see OMIM 
604912). Other TAF genes have been implicated in X-linked ID (V.K. et al., manuscript in 
preparation). MAL2 is another, less likely, candidate in this family. 

60 S TMEM135 C228S 2.4 16.89 S, P, E; transmembrane protein involved in fat metabolism and energy expenditure’®. 

300 NS TRMT1 230fs 3.4 10.34 P, E; encodes dimethylguanosine tRNA methyltransferase’”. At least two other RNA 
methyltransferases have been implicated in ID (ref. 17 and LA.M., manuscript in 
preparation). 

68 NS,ASD  UBR7 124S 2.5 8.78 P, E; encodes N-recognin 7, a component of E3 ubiquitin ligase®°. Involved in protein 
degradation, which has been implicated in ID. 

8500320 S WDR45L R109Q 1.93 2.55 P, E; WD repeat domain, phosphoinositide-interacting protein 3, ILF1-like®*, specific 
function unknown. 

69 S ZBTB40 Q525X 3.5 14.56 S, P, E; kruppel-type zinc finger, highly expressed in brain. Regulator of glia 
differentiation?®. 

56 NS ZCCHC8 L9OX 23 7.64 P; zinc-finger protein, identified in the spliceosome C complex. Interacts with BRCA1 and 
RBM7 (refs 82, 83). RBM10 has been implicated in X-linked ID (V.K. et al., manuscript in 
preparation). 

M025 NS ZNF526 R459Q 45 6.13 P; zinc-finger protein, only remaining change in family. Functional relevance supported by 
3D modelling. Probable activator of mRNA translation. Allelic ZNF526 mutation observed 
in family 8500156. 

8500156 NS ZNF526 Q539H 4.04 11.33 P; see family M025 with allelic ZNF526 mutation. 

References 44-83 are listed in Supplementary Information. E, high evolutionary conservation score; P, high pathogenicity score, includes truncating mutations; S, only change found in family. ASD, autism spectrum disorder; GPCR, G-protein-coupled receptor; ID, intellectual disability; NS, non-syndromic; S, syndromic. 

* See Supplementary Information for further details. 

† Parents are distantly related. LOD scores provided are minimum estimates, calculated on the assumption that they are second cousins. 

‡ In ethnically matching healthy controls a single heterozygous carrier was found (for details, see Supplementary Table 3). 


products of known or novel genes associated with intellectual disability, 
as shown in Fig. 1. 


Most ARID genes are not synapse specific 


We have previously shown that ARID is an extremely heterogeneous 
disorder®. In contrast to non-syndromic hearing impairment or 
X-linked intellectual disability, common forms of ARID do not seem 
to exist, although there is evidence for regional clustering of the 
underlying gene defects*. Extrapolating from the number of known 
X-chromosomal intellectual disability genes argues for the involvement 
of several hundred genes in non-syndromic ARID, and the 
total number of ARID genes may well run into the thousands’. 
Identification of most or all of these genes is a prerequisite for early 
diagnosis, prevention and, eventually, therapy of intellectual disability, 
but at the present pace, many years would be required to accomplish 
this task. Here, we have combined homozygosity mapping, targeted 
exon enrichment and next-generation sequencing to speed up the 
molecular elucidation of ARID. In 78 out of 136 consanguineous 
families investigated, we have found apparently pathogenic mutations 
in single genes. Fifty of these genes had not been implicated in ARID 
before, and only two of these novel intellectual disability genes were 
found to be mutated in two independent families. None of the ~10 
previously known genes for non-syndromic ARID, including those 
that were identified in Iranian families***°, was observed in our present 
cohort, thereby corroborating previous evidence that ARID is 
extremely heterogeneous. 

Much of the research into the molecular causes of intellectual dis- 
ability has focused on the synapse and synapse-specific genes (for 
example, see refs 2, 37). In the present study, relatively few of the 
novel defects identified involve synapse- or neuron-specific genes, 
and they are vastly outnumbered by ubiquitously expressed genes 
with indispensable cellular functions, such as DNA transcription 
and translation, protein degradation, mRNA splicing, energy meta- 
bolism as well as fatty-acid synthesis and turnover. Many of these 
defects were found to be associated with non-syndromic ARID. It is 
not immediately clear why the clinical consequences of defects invol- 
ving such a wide spectrum of basic cellular processes should be con- 
fined to the brain, but this conceivably reflects the complexity of the 
central nervous system which may render it particularly vulnerable to 
damage. 

We expect that these findings will have direct implications for the 
diagnosis and prevention of intellectual disability, and perhaps also 
for autism, schizophrenia and epilepsy, which often co-exist in intel- 
lectual disability patients and are frequently associated with muta- 
tions in the same genes (for example, see ref. 38; reviewed in ref. 1). 
Further investigation of the novel genes and networks presented here 
should significantly deepen our insight into the pathogenesis of intel- 
lectual disability and related disorders. Moreover, this study illustrates 
the power of large-scale next-generation sequencing in families as a 
general strategy to shed light on the aetiology of complex disorders 
and on the function of the underlying genes. 

Note added in proof: While this work was in the press, two unrelated 
groups reported on inactivating ERLIN2 mutations in patients with 
recessive intellectual disability and progressive motor dysfunction”. 
Moreover, syndromic forms of intellectual disability have been described 
in patients with AP4B1 and AP4E1 (ref. 41) and MAN1B1 (ref. 42) 
mutations, respectively. Finally, mutations inactivating the KIF7 gene 
were identified as the cause of the recessive fetal hydrolethalus and 
acrocallosal syndromes that include brain malformations”. 


METHODS SUMMARY 


Most families studied were from Iran, and less than 10% had a Turkish or Arabic 
background. Wechsler Intelligence Scales for Children (WISC) and WAIS were 
used to assess the IQ in children and parents. Many of the pedigrees, as well as the 
methods used for autozygosity mapping, have been described previously. 

Exons from homozygous intervals were enriched with custom-made Agilent 
SureSelect DNA capture arrays and sequenced on an Illumina Genome Analyser 
II yielding 76-bp single reads. More than 98% of the targeted exons were covered by at least 
four non-redundant sequence reads, each with a PHRED-like quality score of 20 
or above (mean, 0.984; median, 0.993; for details, see Supplementary Table 5). 

To assess the reliability of this procedure for calling homozygous mutations, we 
looked up SNP markers from homozygous intervals of five selected families that 
had been analysed with high-resolution SNP arrays. For 773 out of 776 markers, 
next-generation sequencing and array-based SNP typing yielded identical results. 

To detect single nucleotide variants, high-quality reads were aligned to the 
human reference genome (hg18) by SOAP2.20 with default settings, typically 
gap-free. Homozygous exon-spanning deletions were assumed if the sequence 
coverage of the relevant exon(s) was reduced to <5% of the mean. Details about 
the detection of smaller deletions and insertions are provided in Methods. All 
variants were validated by high-resolution array CGH, Sanger sequencing, or both. 
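
The coverage rule for exon-spanning deletions can be illustrated with a minimal sketch. It assumes that "the mean" refers to the mean depth over all targeted exons in one sample; the function name, the exon labels and the depths are hypothetical and are not taken from the study.

```python
from statistics import mean


def call_homozygous_deletions(exon_coverage, threshold=0.05):
    """Return exons whose depth falls below `threshold` times the mean depth.

    `exon_coverage` maps an exon label to its mean read depth. The 5% cut-off
    follows the rule stated in the text; taking the mean over all exons of the
    sample is an assumption of this sketch.
    """
    cutoff = threshold * mean(exon_coverage.values())
    return sorted(exon for exon, depth in exon_coverage.items() if depth < cutoff)


# Hypothetical per-exon read depths for one sample.
coverage = {"exon1": 42.0, "exon2": 38.5, "exon3": 0.4, "exon4": 45.1, "exon5": 40.0}
deleted = call_homozygous_deletions(coverage)
print(deleted)
```

In a real analysis the candidate exons would still need the array CGH or Sanger validation described above, because low coverage can also reflect poor capture efficiency.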

Homozygous variants were filtered against dbSNP130/131, whole genomes 
from 185 healthy individuals studied by the 1000 Genomes Project and exomes 
from 200 Danish individuals, and found to be absent in at least 100 chromosomes 
from Iranian controls (see Supplementary Tables 1 and 3). To select and prioritize 
apparently disease-causing variants, various criteria were used (for more details, 
see Methods). All putative mutations co-segregated with intellectual disability in 
the respective families. 
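
As an illustration of the filtering step, the sketch below removes candidate variants that occur in any reference collection. The `Variant` record, the function name and every coordinate are invented for the example; the actual pipeline and its data formats are described only in Methods and the Supplementary Information.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_known(candidates: Iterable[Variant], *known_sets: set) -> list:
    """Keep only candidates that are absent from every reference set."""
    known = set().union(*known_sets)
    return [v for v in candidates if v not in known]


# All coordinates and alleles below are invented for illustration.
dbsnp = {Variant("chr1", 1000, "A", "G")}
thousand_genomes = {Variant("chr2", 2000, "C", "T")}
danish_exomes = {Variant("chr3", 3000, "G", "A")}

candidates = [
    Variant("chr1", 1000, "A", "G"),
    Variant("chr2", 2000, "C", "T"),
    Variant("chr9", 9000, "T", "C"),
]
novel = filter_known(candidates, dbsnp, thousand_genomes, danish_exomes)
print(novel)
```

Exact matching on chromosome, position and alleles is the simplest possible criterion; it is a design choice of this sketch and may differ from the matching used in the study.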

Frequent pathway mutations of splicing 
machinery in myelodysplasia 


Kenichi Yoshida, Masashi Sanada, Yuichi Shiraishi, Daniel Nowak, Yasunobu Nagata, Ryo Yamamoto, Yusuke Sato, 

Myelodysplastic syndromes and related disorders (myelodysplasia) are a heterogeneous group of myeloid neoplasms 
showing deregulated blood cell production with evidence of myeloid dysplasia and a predisposition to acute myeloid 
leukaemia, whose pathogenesis is only incompletely understood. Here we report whole-exome sequencing of 29 
myelodysplasia specimens, which unexpectedly revealed novel pathway mutations involving multiple components of 
the RNA splicing machinery, including U2AF35, ZRSR2, SRSF2 and SF3B1. In a large series analysis, these splicing pathway 
mutations were frequent (~45 to ~85%) in, and highly specific to, myeloid neoplasms showing features of myelodysplasia. 
Conspicuously, most of the mutations, which occurred in a mutually exclusive manner, affected genes involved in the 
3’-splice site recognition during pre-mRNA processing, inducing abnormal RNA splicing and compromised 
haematopoiesis. Our results provide the first evidence indicating that genetic alterations of the major splicing 
components could be involved in human pathogenesis, also implicating a novel therapeutic possibility for myelodysplasia. 


Myelodysplastic syndromes (MDS) and related disorders (myelodys- 
plasia) comprise a group of myeloid neoplasms characterized by 
deregulated, dysplastic blood cell production and a predisposition to 
acute myeloid leukaemia (AML)’. Although the prevalence of MDS has 
not been determined precisely, more than 10,000 people are estimated 
to develop myelodysplasia annually in the United States*. Their indol- 
ent clinical course before leukaemic transformation and ineffective 
haematopoiesis with evidence of myeloid dysplasia indicate a patho- 
genesis distinct from that involved in de novo AML. Currently, a 
number of gene mutations and cytogenetic changes have been impli- 
cated in the pathogenesis of MDS, including mutations of RAS, TP53 
and RUNX1, and more recently ASXL1, c-CBL, DNMT3A, IDH1/2, 
TET2 and EZH2 (ref. 3). Nevertheless, mutations of this set of genes 
do not fully explain the pathogenesis of MDS because they are also 
commonly found in other myeloid malignancies and roughly 20% of 
MDS cases have no known genetic changes (ref. 4 and unpublished 
data). In particular, the genetic alterations responsible for the dys- 
plastic phenotypes and ineffective haematopoiesis of myelodysplasia 
are poorly understood. Meanwhile, the recent development of mas- 
sively parallel sequencing technologies has provided an expanded 
opportunity to discover genetic changes across the entire genomes or 
protein-coding sequences in human cancers at a single-nucleotide 
level’-"°, which could be successfully applied to the genetic analysis 
of myelodysplasia to obtain a better understanding of its pathogenesis. 


Overview of genetic alterations 


In this study, we performed whole-exome sequencing of paired 
tumour/control DNA from 29 patients with myelodysplasia (Sup- 
plementary Table 1). Although incapable of detecting non-coding 
mutations and gene rearrangements, the whole-exome approach is 
a well-established strategy for obtaining comprehensive registries of 
protein-coding mutations at low cost and high performance. With a 
mean coverage of 133.8, 80.4% of the target sequences were analysed 
at more than ×20 depth on average (Supplementary Fig. 1). All the 
candidates for somatic mutations (N = 497) generated through our 
data analysis pipeline were subjected to validation using Sanger 
sequencing (Supplementary Methods I and Supplementary Fig. 2). 
Finally, 268 non-synonymous somatic mutations were confirmed 
with an overall true positive rate of 53.9% (Supplementary Fig. 3), 
including 206 missense, 25 nonsense, and 10 splice site mutations, 
and 27 frameshift-causing insertions/deletions (indels) (Supplemen- 
tary Fig. 4). The mutation rate of 9.2 (0-21) per sample was signifi- 
cantly lower than that in solid tumours (16.2-302)”""" and multiple 
myeloma (32.4)°, but was comparable to that in AML (7.3-13)'** 
and chronic lymphocytic leukaemia (11.5)'®. Combined with the 
genomic copy number profile obtained by single nucleotide poly- 
morphism (SNP) array karyotyping, this array of somatic mutations 
provided a landscape of myelodysplasia genomes (Supplementary 
Fig. 5)!728, 
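
The validation figures quoted above are mutually consistent, as the short check below shows. It uses only numbers stated in the text; the variable names are ours.

```python
# Counts of confirmed non-synonymous somatic mutations, as stated in the text.
confirmed = {"missense": 206, "nonsense": 25, "splice_site": 10, "frameshift_indel": 27}
candidates = 497  # candidate somatic mutations submitted to Sanger validation
samples = 29      # exome-sequenced myelodysplasia specimens

total_confirmed = sum(confirmed.values())
true_positive_rate = total_confirmed / candidates
mutations_per_sample = total_confirmed / samples

print(f"{total_confirmed} confirmed mutations")
print(f"true positive rate: {true_positive_rate:.1%}")
print(f"mean mutations per sample: {mutations_per_sample:.1f}")
```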

Novel gene targets in myelodysplasia 

The list of the somatic mutations (Supplementary Table 2) included 
most of the known gene targets in myelodysplasia with similar muta- 
tion frequencies to those previously reported, indicating an acceptable 
sensitivity of the current study. The mutations of the known gene 
targets, however, accounted for only 12.3% of all detected mutations 
(N = 33), and the remaining 235 mutations involved previously un- 
reported genes. Among these, recurrently mutated genes in multiple 
cases are candidate targets of particular interest, for which high muta- 
tion rates are expected in general populations. In fact, 8 of the 12 
recurrently mutated genes were among the well-described gene targets 
in myelodysplasia (Supplementary Table 3). However, what immedi- 
ately drew our attention were the recurrent mutations involving 
U2AF35 (also known as U2AF1), ZRSR2 and SRSF2 (SC35), because 
they belong to the common pathway known as RNA splicing. Including 
an additional three genes mutated in single cases (SF3A1, SF3B1 and 
PRPF40B), six components of the splicing machinery were mutated in 
16 out of the 29 cases (55.2%) in a mutually exclusive manner (Fig. 1, 
Supplementary Fig. 6 and Supplementary Table 2). 
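
A mutually exclusive pattern means that no case carries mutations in more than one gene of the pathway. The sketch below shows one way such a check could be written; the gene set is the six splicing components named in the text, whereas the case labels and their mutation lists are hypothetical.

```python
# The six splicing components named in the text.
SPLICING_GENES = frozenset({"U2AF35", "ZRSR2", "SRSF2", "SF3A1", "SF3B1", "PRPF40B"})


def pathway_hits(case_mutations, pathway=SPLICING_GENES):
    """Map each case to the subset of its mutated genes that lie in the pathway."""
    return {case: genes & pathway for case, genes in case_mutations.items()}


def is_mutually_exclusive(hits):
    """True if no case carries more than one mutated pathway gene."""
    return all(len(genes) <= 1 for genes in hits.values())


def fraction_mutated(hits):
    """Fraction of cases with at least one mutated pathway gene."""
    return sum(1 for genes in hits.values() if genes) / len(hits)


# Hypothetical cases; only the gene symbols come from the text.
cases = {
    "case_A": {"U2AF35", "TET2"},
    "case_B": {"SRSF2", "RUNX1"},
    "case_C": {"TP53"},
    "case_D": {"SF3B1"},
}
hits = pathway_hits(cases)
print(is_mutually_exclusive(hits), fraction_mutated(hits))
print(f"reported fraction: {16 / 29:.1%}")
```

The last line reproduces the reported 16 out of 29 cases (55.2%). Whether exclusivity is statistically significant is a separate question that this check does not address.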


Frequent mutations in splicing machinery 


RNA splicing is accomplished by a well-ordered recruitment, rearrange- 
ment and/or disengagement of a set of small nuclear ribonucleoprotein 
(snRNP) complexes (U1, U2, and either U4/5/6 or U11/12), as well as 
many other protein components onto the pre-mRNAs. Notably, the 
mutated components of the spliceosome were all engaged in the initial 
steps of RNA splicing, except for PRPF40B, whose functions in RNA 
splicing are poorly defined. Making physical interactions with SF1 and a 
serine/arginine-rich (SR) protein, such as SRSF1 or SRSF2, the U2 
auxiliary factor (U2AF) that consists of the U2AF65 (U2AF2)- 
U2AF35 heterodimer, is involved in the recognition of the 3’ splice site 
(3'SS) and its nearby polypyrimidine tract, which is thought to be 
required for the subsequent recruitment of the U2 snRNP, containing 
SF3A1 as well as SF3B1, to establish the splicing A complex (Fig. 1)”. 
ZRSR2 (or Urp) is another essential component of the splicing 
machinery. Showing a close structural similarity to U2AF35, ZRSR2 
physically interacts with U2AF65, as well as SRSF1 and SRSF2, with a 
distinct function from its homologue, U2AF35 (ref. 20). 

To confirm and extend the initial findings in the whole-exome 
sequencing, we studied mutations of the above six genes together with 
three additional spliceosome-related genes, including U2AF65, SF1 
and SRSF1, in a large series of myeloid neoplasms (N = 582) using a 
high-throughput mutation screen of pooled DNA followed by 
confirmation/identification of candidate mutations (refs 21 and 22 and 
Supplementary Methods I). 

Figure 1 | Components of the splicing E/A complex mutated in 
myelodysplasia. RNA splicing is initiated by the recruitment of U1 snRNP to 
the 5’SS. SF1 and the larger subunit of the U2 auxiliary factor (U2AF), U2AF65, 
bind the branch point sequence (BPS) and its downstream polypyrimidine 
tract, respectively. The smaller subunit of U2AF (U2AF35) binds to the AG 
dinucleotide of the 3’SS, interacting with both U2AF65 and a SR protein, such 
as SRSF2, through its UHM and RS domain, comprising the earliest splicing 
complex (E complex). ZRSR2 also interacts with U2AF and SR proteins to 
perform essential functions in RNA splicing. After the recognition of the 3’SS, 
U2 snRNP, together with SF3A1 and SF3B1, is recruited to the 3’SS to generate 
the splicing complex A. The mutated components in myelodysplasia are 
indicated by arrows. 

In total, 219 mutations were identified in 209 out of the 582 specimens 
of myeloid neoplasms through validating 313 provisional positive events 
in the pooled DNA screen (Supplementary Tables 4 and 5). The mutations 
among four genes, U2AF35 (N = 37), SRSF2 (N = 56), ZRSR2 (N = 23) and 
SF3B1 (N = 79), explained most of the mutations with much lower 
mutational rates for SF3A1 (N = 8), PRPF40B (N = 7), U2AF65 (N = 4) and 
SF1 (N = 5) (Fig. 2). Mutations of the splicing 
machinery were highly specific to diseases showing myelodysplastic fea- 
tures, including MDS either with (84.9%) or without (43.9%) increased 
ring sideroblasts, chronic myelomonocytic leukaemia (CMML) (54.5%), 
and therapy-related AML or AML with myelodysplasia-related changes 
(25.8%), but were rare in de novo AML (6.6%) and myeloproliferative 
neoplasms (MPN) (9.4%) (Fig. 3a). The mutually exclusive pattern of 
the mutations in these splicing pathway genes was confirmed in this 
large case series, suggesting a common impact of these mutations on 
RNA splicing and the pathogenesis of myelodysplasia (Fig. 3b). The 
frequencies of mutations showed significant differences across disease 
types. Surprisingly, SF3B1 mutations were found in the majority of the 
cases with MDS characterized by increased ring sideroblasts, that is, 
refractory anaemia with ring sideroblasts (RARS) (19/23 or 82.6%) and 
refractory cytopenia with multilineage dysplasia with ≥ 15% ring 
sideroblasts (RCMD-RS) (38/50 or 76%) with much lower mutation 
frequencies in other myeloid neoplasms. RARS and RCMD-RS account 
for 4.3% and 12.9% of MDS cases, respectively, where deregulated iron 
metabolism has been implicated in the development of refractory 
anaemia’. With such high mutation frequencies and specificity, the 
SF3B1 mutations were thought to be almost pathognomonic to these 
MDS subtypes characterized by increased ring sideroblasts, and 
strongly implicated in the pathogenesis of MDS in these categories. 
Less conspicuously but significantly, SRSF2 mutations were more 
frequent in CMML cases (Fig. 3 and Supplementary Table 4). Thus, 
although commonly involving the E/A splicing complexes, different 
mutations may still have different impacts on cell functions, 
contributing to the determination of discrete disease phenotypes. For 
example, studies have demonstrated that SRSF2 was also involved in 
the regulation of DNA stability and that depletion of SRSF2 can lead to 
genomic instability”*. Of interest in this context, regardless of disease 
subtypes, samples with SRSF2 mutations were shown to have 
significantly more mutations of other genes compared with samples with 
U2AF35 mutations (P = 0.001, multiple regression analysis) 
(Supplementary Table 6 and Supplementary Fig. 7). 

Figure 2 | Mutations of multiple components of the splicing machinery. 
Each mutation in the eight spliceosome components is shown with an 
arrowhead. Confirmed somatic mutations are discriminated by red arrows. 
Known domain structures are shown in coloured boxes as indicated. Mutations 
predicted as SNPs by MutationTaster (http://www.mutationtaster.org/) are 
indicated by asterisks. The number of each mutation is indicated in parentheses. 
ZRSR2 mutations in females are shown in blue. 

Figure 3 | Frequencies and distribution of spliceosome pathway gene 
mutations in myeloid neoplasms. a, Frequencies of spliceosome pathway 
mutations among 582 cases with various myeloid neoplasms. b, Distribution 
of mutations in eight spliceosome genes, where diagnosis of each sample 
is shown by indicated colours. 

Notably, with a rare exception of A26V in a single case, the mutations 
of U2AF35 exclusively involved two highly conserved amino acid posi- 
tions (S34 or Q157) within the amino- and the carboxyl-terminal zinc 
finger motifs flanking the U2AF homology motif (UHM) domain. 
SRSF2 mutations exclusively occurred at P95 within an intervening 
sequence between the RNA recognition motif (RRM) and arginine/ 
serine-rich (RS) domains (Fig. 2 and Supplementary Figs 8 and 9). 
Similarly, SF3B1 mutations predominantly involved K700 and, to a 
lesser extent, K666, H662 and E622, which are also conserved across 
species (Fig. 2 and Supplementary Fig. 10). The involvement of recur- 
rent amino acid positions in these spliceosome genes strongly indicated 
a gain-of-function nature of these mutations, which has been a well- 
documented scenario in other oncogenic mutations”. On the other 
hand, the 23 mutations in ZRSR2 (Xp22.1) were widely distributed 
along the entire coding region (Fig. 2). Among these, 14 mutations were 
nonsense or frameshift changes, or involved splicing donor/acceptor 
sites that caused either a premature truncation or a large structural 
change of the protein, leading to loss-of-function. Combined with their 
strong male preference for the mutation (14/14 cases), ZRSR2 most 
likely acts as a tumour suppressor gene with an X-linked recessive mode 
of genetic action. The remaining nine ZRSR2 mutations were missense 
changes and found in both males (six cases) and females (three cases), 
whose somatic origin was only confirmed in two cases. However, 
neither the dbSNP database (build131 and 132) nor the 1000 
Genomes database (May 2011 SNP calls) contained these missense 
nucleotides, suggesting that many, if not all, of these missense changes 
are likely to represent functional somatic changes, especially those 
found in males. Interrogation of these hot spots for mutations in 
U2AF35 and SRSF2 found no mutations among lymphoid neoplasms, 
including acute lymphoblastic leukaemia (N = 24) or non-Hodgkin’s 
lymphoma (N = 87) (data not shown). 


RNA splicing and spliceosome mutations 


Because the splicing pathway mutations in myelodysplasia widely and 
specifically affect the major components of the splicing complexes 
E/A in a mutually exclusive manner, the common consequence of 
these mutations is logically the impaired recognition of 3’SSs that 
would lead to the production of aberrantly spliced mRNA species. To 
appreciate this and also to gain an insight into the biological/biochemical 
impact of these splicing mutations, we expressed the wild-type and the 
mutant (S34F) U2AF35 in HeLa cells using retrovirus-mediated gene 
transfer with enhanced green fluorescent protein (EGFP) marking 
(Fig. 4a and Supplementary Methods III) and examined their effects 
on gene expression in these cells using GeneChip Human genome 
U133 plus 2.0 arrays (Affymetrix), followed by gene set enrichment 
analysis (GSEA) (Supplementary Methods IV)’. Intriguingly, the 
GSEA disclosed a significant enrichment of the genes on the 
nonsense-mediated mRNA decay (NMD) pathway among the significantly 
upregulated genes in mutant U2AF35-transduced HeLa cells (Fig. 4b, 
Supplementary Fig. 11a and Supplementary Table 7), which was 
confirmed by quantitative polymerase chain reaction (qPCR) (Fig. 4c 
and Supplementary Methods V). A similar result was also observed for 
the gene expression profile of an MDS-derived cell line (TF-1) 
transduced with the S34F mutant (Supplementary Fig. 11b, c). The NMD 
activation by the mutant U2AF35 was suppressed significantly by the 
co-overexpression of the wild-type protein (Supplementary Fig. 11d), 
indicating that the effect of the mutant protein was likely to be mediated by 
inhibition of the functions of the wild-type protein. Given that the NMD 
pathway, known as mRNA surveillance, provides a post-transcriptional 
mechanism for recognizing and eliminating abnormal transcripts that 
prematurely terminate translation”, the result of the GSEA 
indicated that the mutant U2AF35 induced abnormal RNA splicing in 
HeLa and TF-1 cells, leading to the generation of unspliced RNA species 
having a premature stop codon and induction of the NMD activity. 

To confirm this, we next performed whole transcriptome analysis in 
these cells using the GeneChip Human exon 1.0 ST Array 
(Affymetrix), in which we differentially tracked the behaviour of two 
discrete sets of probes showing different levels of evidence of being 
exons, that is, ‘Core’ (authentic exons) and ‘non-Core’ (more likely 
introns) sets (Supplementary Methods IV and Supplementary Fig. 12). 
As shown in Fig. 4d, the Core and non-Core set probes were differ- 
entially enriched among probes showing significant difference in 
expression between wild-type and mutant-transduced cells (false dis- 
covery rate (FDR) = 0.01). The Core set probes were significantly 
enriched in those probes significantly downregulated in mutant 
U2AF35-transduced cells compared with wild-type U2AF35-trans- 
duced cells, whereas the non-Core set probes were enriched in those 
probes significantly upregulated in mutant U2AF35-transduced cells 
(Fig. 4e). The significant differential enrichment was also demon- 
strated, even when all probe sets were included (Fig. 4f). Moreover, 
the significantly differentially expressed Core set probes tended to be 
up- and downregulated in wild-type and mutant U2AF35-transduced 
cells compared with mock-transduced cells, respectively, and vice versa 
for the differentially expressed non-Core set probes (Fig. 4e). 
Combined, these exon array results indicated that the wild-type 
U2AF35 correctly promoted authentic RNA splicing, whereas the 
mutant U2AF35 inhibited this process, causing non-Core, and 
therefore more likely intronic, sequences to remain unspliced. 
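
The differential enrichment of Core and non-Core probes was assessed with the Mann-Whitney U test (Fig. 4f). As a rough illustration of the statistic only, the sketch below counts, over all pairs, how often a value from one group exceeds a value from the other. The splicing-index values are invented, and no P value is computed here.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs versus ys.

    Counts the (x, y) pairs with x > y; tied pairs contribute 0.5.
    """
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Invented splicing-index values; not data from the study.
core = [-1.2, -0.8, -0.5, 0.1]
non_core = [0.3, 0.9, 1.4]

u_core = mann_whitney_u(core, non_core)
u_non_core = mann_whitney_u(non_core, core)
print(u_core, u_non_core)
```

The two statistics always sum to the number of pairs, so a value near zero for one group indicates that it is shifted almost entirely below the other.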

The abnormal splicing in mutant U2AF35-transduced cells was more 
directly demonstrated by sequencing mRNAs extracted from HeLa 
cells, in which expression of the wild-type and mutant (S34F) 
U2AF35 was induced by doxycycline. First, after adjusting for the total 
number of mapped reads, the wild-type U2AF35-transduced cells 
showed increased read counts in the exon fraction, but reduced 
counts in other fractions, compared with mutant U2AF35-transduced 
cells (Fig. 4g). The reads from the mutant-transduced cells were 
mapped to broader genomic regions compared with those from the 
wild-type U2AF35-transduced cells, which were largely explained by 
non-exon reads (Fig. 4h). Finally, the number of those reads that 
encompassed the authentic exon/intron junctions was significantly 
increased in mutant U2AF35-transduced cells compared with wild-type 
U2AF35-transduced cells (Fig. 4i and Supplementary Methods VI). 
These results clearly demonstrated that failure of splicing ubiquitously 
occurred in mutant U2AF35-transduced cells. A typical example of 
abnormal splicing in mutant-transduced cells and the list of signifi- 
cantly unspliced exons are shown in Supplementary Fig. 13 and Sup- 
plementary Table 8, respectively. 
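
The junction-read comparison rests on an odds ratio that is evaluated against simulated values under the null hypothesis (Fig. 4i). The sketch below is one possible reading of that procedure, not the authors' implementation: the read counts are hypothetical and far smaller than in a real experiment, a continuity correction of 0.5 is added to avoid division by zero, and the null model assumes that both samples share a single pooled junction-read probability.

```python
import random


def odds_ratio(junction_a, total_a, junction_b, total_b):
    """Odds ratio of junction reads, sample A versus sample B.

    A continuity correction of 0.5 is added to every cell; this is an
    assumption of the sketch.
    """
    odds_a = (junction_a + 0.5) / (total_a - junction_a + 0.5)
    odds_b = (junction_b + 0.5) / (total_b - junction_b + 0.5)
    return odds_a / odds_b


def empirical_p(junction_a, total_a, junction_b, total_b, n_sim=1000, seed=0):
    """One-sided empirical P value for the observed odds ratio."""
    rng = random.Random(seed)
    observed = odds_ratio(junction_a, total_a, junction_b, total_b)
    pooled = (junction_a + junction_b) / (total_a + total_b)
    extreme = 0
    for _ in range(n_sim):
        # Draw junction-read counts for both samples from the pooled rate.
        sim_a = sum(rng.random() < pooled for _ in range(total_a))
        sim_b = sum(rng.random() < pooled for _ in range(total_b))
        if odds_ratio(sim_a, total_a, sim_b, total_b) >= observed:
            extreme += 1
    return (extreme + 1) / (n_sim + 1)


# Hypothetical counts: junction reads out of total mapped reads.
mutant_junction, mutant_total = 60, 500
wild_junction, wild_total = 20, 500

observed_or = odds_ratio(mutant_junction, mutant_total, wild_junction, wild_total)
p_value = empirical_p(mutant_junction, mutant_total, wild_junction, wild_total)
print(observed_or, p_value)
```

The number of simulations is kept at 1,000 so that the example runs quickly; the figure legend reports 10,000 simulated values.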


Biological consequence of U2AF35 mutations 

Finally, we examined the biological effects of compromised func- 
tions of the E/A splicing complexes. First, TF-1 and HeLa cells were 
transduced with lentivirus constructs expressing either the S34F 
U2AF35 mutant or wild-type U2AF35 under a tetracycline-inducible 
promoter (Fig. 5a and Supplementary Figs 14a and 15a), and cell 
proliferation was examined after the induction of their expres- 
sion. Unexpectedly, after the induction of gene expression with 
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Figure 4| Altered RNA splicing caused by a U2AF35 mutant. a, Western 
blot analyses showing expression of transduced wild-type or mutant (S34F) 
U2AF35 in HeLa cells used for the analyses of expression and exon microarrays. 
b, The GSEA demonstrating a significant enrichment of the set of 17 NMD 
pathway genes among significantly differentially expressed genes between wild- 
type and mutant U2AF35-transduced HeLa cells. The significance of the gene 
set was empirically determined by 1,000 gene-set permutations. c, The 
confirmation of the microarray analysis for the expression of nine genes that 
contributed to the core enrichment in the NMD gene set. Means + s.e. are 
provided for the indicated NMD genes. P values were determined by the Mann- 
Whitney U test. d, Significantly upregulated and downregulated probe sets 
(FDR = 0.01) in mutant U2AF35-transduced cells compared with wild-type 
U2AF35-transduced cells in triplicate exon array experiments are shown in a 
heat map. The origin of each probe set is depicted in the left lane, where red and 
green bars indicate the Core and non-Core sets, respectively. e, Pair-wise scatter 
plots of the normalized intensities of entire probe sets (grey) across different 
experiments. The Core and non-Core set probes that were significantly 
differentially expressed between the wild-type and mutant U2AF35-transduced 
cells are plotted in red and green, respectively. f, Distribution of the Core (red) 
and non-Core (green) probe sets within the entire probe sets ordered by splicing 
index (S.L; Supplementary Methods IV), calculated between wild-type and 
mutant U2AF35-transduced cells. In the right panel, the differential enrichment 
of both probe sets was confirmed by Mann-Whitney U test. g, Difference in 
read counts for the indicated fractions per 10° total reads in RNA sequencing 
between wild-type and mutant U2AF35-expressing HeLa cells analysis. 
Increased/decreased read counts in mutant U2AF35-expressing cells are 
plotted upward/downward, respectively. h, Comparison of the genome 
coverage by the indicated fractions in wild-type- and mutant-U2AF35- 
expressing cells. The genome coverage was calculated for each fraction within 
the 10® reads randomly selected from the total reads and averaged for ten 
independent selections. i, The odds ratio of the junction reads within the total 
mapped reads was calculated between the two experiments (red circle), which 
was evaluated against the 10,000 simulated values under the null hypothesis 
(histogram in blue). 


doxycycline, the mutant U2AF35-transduced cells, but not the wild- 
type U2AF35-transduced cells, showed reduced cell proliferation 
(Fig. 5b and Supplementary Fig. 15b) with a marked increase in the 
G2/M fraction (G2/M arrest) together with enhanced apoptosis as 
Figure 5 | Functional analysis of mutant U2AF35. a, Expression of 
endogenous and exogenous U2AF35 transcripts in HeLa cells before and after 
induction determined by RNA sequencing. U2AF35 transcripts were 
differentially enumerated for endogenous and exogenous species, which were 
discriminated by the Flag sequence. b, Cell proliferation assays of U2AF35-
transduced HeLa cells, where cell numbers were measured using cell-counting 
apparatus and are plotted as mean absorbance ± s.d. c, The flow cytometry 
analysis of propidium iodide (PI)-stained HeLa cells transduced with the 
different U2AF35 constructs. Mean fractions ± s.d. in G0/G1, S and G2/M 
populations after the induction of U2AF35 expression are plotted. d, Fractions of 
the annexin V-positive (AnnV+) populations among the 7-amino-actinomycin 
D (7AAD)-negative population before and after the induction of U2AF35 
expression are plotted as mean ± s.d. for indicated samples. The significance of 
difference was determined by paired t-test. e, Competitive reconstitution assays 
for CD34-negative KSL cells transduced with indicated U2AF35 mutants. 
Chimaerism in the peripheral blood 6 weeks after transplantation is plotted as 
mean %EGFP-positive Ly5.1 cells ± s.d., where outliers were excluded from the 
analysis. The significance of differences was evaluated by the Grubbs test with 
Bonferroni’s correction for multiple testing. *not significant. 


indicated by the increased sub-G1 fraction and annexin V-positive cells 
(Fig. 5c, d, Supplementary Fig. 14b and Supplementary Methods VI). To 
confirm the growth-suppressive effect of U2AF35 mutants in vitro, a 
highly purified haematopoietic stem cell population (CD34⁻c-Kit⁺Sca-1⁺Lin⁻, 
CD34⁻KSL) prepared from C57BL/6 (B6)-Ly5.1 
mouse bone marrow™® was retrovirally transduced with either the 
mutant (S34F, Q157P and Q157R) or wild-type U2AF35, or the mock 
constructs, each harbouring the EGFP marker gene (Supplementary Fig. 
16). The ability of these transduced cells to reconstitute the haemato- 
poietic system was tested in a competitive reconstitution assay. The 
transduced cells were mixed with whole bone marrow cells from 
B6-Ly5.1/5.2 F1 mice, transplanted into lethally irradiated B6-Ly5.2 
recipients, and peripheral blood chimaerism derived from EGFP- 
positive cells was assessed 6 weeks after transplantation by flow cytometry. 
We confirmed that each recipient mouse received comparable numbers 
of EGFP-positive cells among the different retrovirus groups by estim- 
ating the percentage of EGFP-positive cells and overall proliferation in 
transduced cells by ex vivo tracking. Also no significant difference was 
observed in their homing capacity to bone marrow as assessed by 
transwell migration assays (Supplementary Fig. 17). As shown in 
Fig. 5e, the wild-type U2AF35-transduced cells showed a slightly higher 
reconstitution capacity than the mock-transduced cells. On the other 
hand, the recipients of the cells transduced with the various U2AF35 
mutants showed significantly lower EGFP-positive cell chimaerism 
than those of either the mock- or the wild-type U2AF35-transduced 
cells, indicating a compromised reconstitution capacity of the haematopoietic 
stem/progenitor cells expressing the U2AF35 mutants. In 
summary, these mutants lead to loss-of-function of U2AF35 most 
probably by acting in a dominant-negative fashion to the wild-type 
protein. 


Discussion 


Our whole-exome sequencing study unexpectedly unmasked a com- 
plexity of novel pathway mutations found in approximately 45% to 
85% of myelodysplasia patients depending on the disease subtypes, 
which affected multiple but distinctive components of the splicing 
machinery and, as such, demonstrated the unquestionable power of 
massively parallel sequencing technologies in cancer research. 

The RNA splicing system comprises essential cellular machinery, 
through which eukaryotes can achieve successful transcription and 
guarantee the functional diversity of their protein species using 
alternative splicing in the face of a limited number of genes”. 
Accordingly, the meticulous regulation of this machinery should be 
indispensable for the maintenance of cellular homeostasis”, deregu- 
lation of which causes severe developmental abnormalities*'**. The 
current discovery of frequent mutations of the splicing pathway in 
myelodysplasia, therefore, represents another remarkable example 
that illustrates how cancer develops by targeting critical cellular func- 
tions. It also provides an intriguing insight into the mechanism of 
‘cancer specific’ alternative splicing, which has long been implicated 
in the development of cancer, including MDS and other haemato- 
poietic neoplasms****. 

In myelodysplasia, the major targets of spliceosome mutations 
seemed to be largely confined to the components of the E/A splicing 
complex, among others to SF3B1, SRSF2, U2AF35 and ZRSR2, and to 
a lesser extent, to SF3A1, SF1, U2AF65 and PRPF40B. The broad 
coverage of the wide spectrum of spliceosome components in our 
exome sequencing was likely to preclude frequent involvement of 
other components on this pathway (Supplementary Fig. 18). The 
surprising frequency and specificity of these mutations in this complex, 
together with the mutually exclusive manner in which they occurred, 
unequivocally indicate that the compromised function of the E/A 
complex is a hallmark of this unique category of myeloid neoplasms, 
playing a central role in the pathogenesis of myelodysplasia. The close 
relationship between the mutation types and unique disease subtypes 
also supports their pivotal roles in MDS. 

Given the critical functions of the E/A splicing complex in precise 3′SS 
recognition, the logical consequence of these relevant 
mutations would be the impaired splicing involving diverse RNA 
species. In fact, when expressed in HeLa cells, the mutant U2AF35 
induced global abnormalities of RNA splicing, leading to increased 
production of transcripts having unspliced intronic sequences. On the 
other hand, the functional link between the abnormal splicing of RNA 
species and the phenotype of myelodysplasia is still unclear. Mutant 
U2AF35 seemed to suppress cell growth/proliferation and induce 
apoptosis rather than confer a growth advantage or promote clonal 
selection. ZRSR2 knockdown in HeLa cells has been reported to also 
result in reduced viability, arguing for the common consequence of 
these pathway mutations. These observations suggested that the 
oncogenic actions of these splicing pathway mutations are distinct 
from what is expected for classical oncogenes, such as mutated kinases 
and signal transducers, but could be more related to cell differenti- 
ation. Of note in this regard, the commonest clinical presentation of 
MDS is severe cytopenia in multiple cell lineages due to ineffective 
haematopoiesis with increased apoptosis rather than unlimited cell 
proliferation’. In this regard, lessons may be learned from the recent 
findings on the pathogenesis of the 5q− syndrome, where haploinsufficiency 
of RPS14 leads to increased apoptosis of erythroid progenitors, 
but not myeloproliferation’®”. 

Many issues remain to be resolved, however, to establish the 
functional link between these splicing pathway mutations and the 
pathogenesis of MDS, where the broad spectrum of RNA species 
affected by impaired splicing hampers identification of responsible 
gene targets. Moreover, the mutated components of the splicing 
machinery have distinct functions of their own other than direct regulation 
of RNA splicing, including roles in elongation and DNA stability, 
which may be important in determining specific disease phenotypes. 
Clearly, more studies are required to answer these questions through 
understanding of the molecular basis of their oncogenic actions. 


METHODS SUMMARY 


Whole-exome sequencing of paired tumour/normal DNA samples from the 29 
patients was performed after informed consent was obtained. SNP array-based 
copy number analysis was performed as previously described'”'*. Mutation analysis 
of the splicing pathway genes in a set of 582 myeloid neoplasms was performed 
by first screening mutations in PCR-amplified pooled targets from 12 
individuals, followed by validation/identification of the candidate mutations 
within the corresponding 12 individuals by Sanger sequencing. Flag-tagged 
cDNAs of the wild-type and mutant U2AF35 were generated by in vitro muta- 
genesis, constructed into a murine stem cell virus-based retroviral vector as well as 
a tetracycline-inducible lentivirus-based expression vector, and used for gene 
transfer to CD34⁻ KSL cells and cultured cell lines, with EGFP marking, respectively. 
Total RNA was extracted from wild-type or mutant U2AF35-transduced 
HeLa and TF-1 cells, and analysed on microarrays. RNA sequencing was per- 
formed according to the manufacturer’s instructions (Illumina). Cell proliferation 
assays (MTT assays) on HeLa and TF-1 cells stably transduced with lentivirus 
U2AF35 constructs were performed in the presence or absence of doxycycline. For 
competitive reconstitution assays, CD34⁻ KSL cells collected from C57BL/6 (B6)-Ly5.1 
mice were retrovirally transduced with various U2AF35 constructs with 
EGFP marking, and transplanted with competitor cells (B6-Ly5.1/5.2 F1 mouse 
origin) into lethally irradiated B6-Ly5.2 mice 48 h after gene transduction. 
Frequency of EGFP-positive cells was assessed in peripheral blood by flow cyto- 
metry 6 weeks after the transplantation (Supplementary Methods VII). The primer 
sets used for validation of gene mutations and qPCR of NMD gene expression are 
listed in Supplementary Tables 9–11. A complete description of the materials and 
methods is provided in the Supplementary Information. This study was approved 
by the ethics boards of the University of Tokyo, Munich Leukaemia Laboratory, 
University Hospital Mannheim, University of Tsukuba, Tokyo Metropolitan 
Ohtsuka Hospital and Chang Gung Memorial Hospital. Animal experiments were 
performed with approval of the Animal Experiment Committee of the University 
of Tokyo. 
Human oocytes reprogram somatic cells to a pluripotent state 


Scott Noggle, Ho-Lim Fung, Athurva Gore, Hector Martinez, Kathleen Crumm Satriani, Robert Prosser, Kiboong Oum, 
Daniel Paull, Sarah Druckenmiller, Matthew Freeby, Ellen Greenberg, Kun Zhang, Robin Goland, Mark V. Sauer, 
Rudolph L. Leibel & Dieter Egli 


The exchange of the oocyte’s genome with the genome of a somatic cell, followed by the derivation of pluripotent stem 
cells, could enable the generation of specific cells affected in degenerative human diseases. Such cells, carrying the 
patient’s genome, might be useful for cell replacement. Here we report that the development of human oocytes after 
genome exchange arrests at late cleavage stages in association with transcriptional abnormalities. In contrast, if the 
oocyte genome is not removed and the somatic cell genome is merely added, the resultant triploid cells develop to the 
blastocyst stage. Stem cell lines derived from these blastocysts differentiate into cell types of all three germ layers, and a 
pluripotent gene expression program is established on the genome derived from the somatic cell. This result 
demonstrates the feasibility of reprogramming human cells using oocytes and identifies removal of the oocyte 
genome as the primary cause of developmental failure after genome exchange. 


The generation of animals by transfer of the genome from an adult cell 
into an unfertilized oocyte’, and the isolation of pluripotent stem cells 
from human blastocysts’, raised the prospect of generating stem cells 
with a patient’s genome. This prospect holds much medical promise 
as these patient-specific stem cells could be used to generate differ- 
entiated cells for cell replacement. Unfortunately, progress towards 
this goal has been slowed by legal and social considerations limiting 
the availability of human oocytes for research. Despite these limita- 
tions, several studies were conducted*!', but none have achieved the 
derivation of a stem cell line. Thus, the question of whether human 
oocytes have the ability to reprogram somatic cells to a pluripotent 
state has remained unanswered. 

Although it is now possible to induce pluripotent stem cell (iPS) 
formation by forced expression of transcription factors in somatic 
cells'’, differences between iPS- and blastocyst-derived stem cells have 
been reported for gene expression’*"*, DNA methylation’*"® and dif- 
ferentiation potential’’. In addition, reprogramming to iPS cells seems 
to compromise genomic integrity, introducing de novo mutations'* and 
copy number variations'®”°. Whether reprogramming using human 
oocytes yields pluripotent stem cells without these abnormalities 
remains to be determined. 

Various sources of oocytes have been explored, including failed fertilized 
oocytes*, oocytes deemed in excess of clinical need*°”’, in vitro 
matured oocytes"® and fertilized oocytes”. Previously, we have found 
that very few women agree to donate their oocytes for research without 
payment". The majority of oocyte donors believe that payment should 
be provided regardless of whether the oocytes are used for research or 
reproductive purposes”*. Payment for reproductive oocyte donation is 
common in the USA with more than 8,000 donor in vitro fertilization 
(IVF) cycles performed annually™. Recognizing the varying views on 
payments to research oocyte donors, the American Society for 
Reproductive Medicine and the International Society for Stem Cell 
Research have proposed balanced guidelines**”° which allow payment 
at the discretion of research oversight committees, which must ensure 
that financial considerations do not constitute an undue inducement. 
Following on from those guidelines, we have developed protocols that 
were reviewed and approved by the institutional review board and stem 
cell committees of Columbia University. These protocols allowed 
women participating in the reproductive egg donation program to 
select between donation for reproductive purposes and donation for 
research, offering equal remuneration regardless of their choice. 
Consequently, the decision to donate was made before, and independently of, 
the decision to donate for research. Our study of 270 mature human 
oocytes revealed that the exchange of the oocyte genome with the 
genome of a somatic cell consistently leads to developmental arrest. 
However, when the oocyte genome is not removed, and the somatic cell 
genome is merely added, the activated human oocytes develop to the 
blastocyst stage. Human stem cells derived from these blastocysts 
contain both a haploid genome derived from the oocyte and a diploid 
somatic cell genome reprogrammed to a pluripotent state. 


Results 

Development fails after somatic genome replacement 

In several mammalian species, somatic cell reprogramming has been 
achieved by replacing the oocyte genome at metaphase II (MII) of 
meiosis with a somatic cell nucleus (Fig. 1a). To remove the oocyte 
genome, we identified the location of the spindle-chromosome complex 
in 38/50 MII oocytes (Fig. 1b). All oocytes (43/43) survived genome 
removal with (31) or without (12) addition of Hoechst stain/minimal 
ultraviolet light exposure to verify enucleation. Oocytes lacking a 
genome were used for the transfer of somatic cell genomes obtained 
from skin cells of a male diabetic (T1D) and a healthy male adult. They 
were labelled with a green fluorescent protein (GFP) or a histone 2b 
(H2B):GFP transgene under the control of the ubiquitously expressed 
CAGGS promoter (Fig. 1c). Because Hoechst staining seemed to inhibit 
nuclear remodelling in rhesus oocytes”, we monitored chromo- 
some condensation every hour after transfer. All oocytes (35/35), 
whether or not they had been exposed to Hoechst, condensed the 
Figure 1 | Developmental and transcriptional defects after genome 
exchange. a, Schematic of genome exchange in human oocytes. b, Human 
oocyte at the MII stage, viewed by microtubule birefringence. c, Donor cell 
population marked with either H2B:GFP or GFP. d, Somatic chromatin 3h 
after transfer. e, Timing of chromosome condensation. f, Developmental 
potential. Vertical axis is the percentage of activated eggs reaching specific 
developmental stages (horizontal axis). Days indicate the time points of normal 


somatic chromosomes (Fig. 1d, e); upon activation, 22/31 (71%) of the 
oocytes continued normal cleavage development. However, as we had 
previously observed following nuclear transfer into human zygotes”, 
development arrested at a stage of 6-10 cells (Fig. 1f, g). 

As a control for the quality of the oocytes, the development of in 
vitro fertilized donor oocytes was followed at the IVF clinic; 16/21 
(76%) developed to the blastocyst or morula stage by day 6 of culture, 
indicating excellent developmental potential (Supplementary Table 1). 
Likewise, artificially activated oocytes developed to the morula and 
blastocyst stages (13/52 or 25%), well beyond the point of develop- 
mental arrest seen after genome exchange (Fig. 1f). 

As a control for our experimental manipulations, we transferred 
H2B:GFP-labelled somatic nuclei without immediately removing the 
oocyte genome (Supplementary Fig. 1a). Six to eight hours after arti- 
ficial activation two interphase nuclei had formed within a single cell 
(Supplementary Fig. 1b). The H2B:GFP-labelled genome could be 
distinguished from the unlabelled oocyte genome, and either of them 
specifically extracted using contrast optics and GFP fluorescence 
(Supplementary Fig. 1c, e); Hoechst staining and ultraviolet illumination 
were not required. Both types of cells therefore experienced the 
same manipulations, but differed in their ultimate genetic content, 
having either the oocyte genome or the somatic cell genome. 
Activated oocytes containing only the somatic genome formed 4-12 
cells, but all arrested without reactivating the GFP transgene (32/32) 
(Supplementary Fig. 1d). In contrast, activated oocytes containing 
only the oocyte genome cleaved and 4/7 (57%) developed to the 




developmental progression. UV, ultraviolet light. g, Arrested development after 
somatic genome exchange. h, Development after spindle removal and re- 
transfer. i, Cluster diagram of global gene expression. *from ref. 22. j, Venn 
diagram of transcripts elevated in IVF samples on day 3-4 of development 
(black circle) in comparison to oocytes. The overlap with parthenotes, 
amanitin-treated samples and genome exchange samples are shown. 


blastocyst stage (Supplementary Fig. 1f), allowing the generation of 
pluripotent parthenogenetic stem cells (Supplementary Fig. 2). 

To test whether developmental potential was determined by the 
state of differentiation of the transferred genome, we replaced the 
oocyte genome with that of a blastomere. Upon activation, develop- 
ment to the morula and blastocyst stage occurred (Fig. 1f and 
Supplementary Fig. 3). Furthermore, following the removal of the 
oocyte’s genome and subsequent retransfer into the same oocyte, 
parthenogenetic development to the blastocyst stage was observed 
(Fig. 1f, h and Supplementary Fig. 4). 


Development arrests with transcriptional defects 

Extensive transcriptional activity from the zygotic genome normally 
starts at the 4-8 cell stage*®, or on day 3 of development, coincident with 
the stage of developmental arrest after somatic genome transfer (Fig. 1f). 
To determine whether the somatic genome was being expressed, we 
compared the transcriptome at the 6-12 cell stage after genome 
exchange, to the transcriptome after artificial activation on day 3/early 
day 4 of development. To distinguish between expression from the trans- 
ferred genome and maternal contributions to transcript abundance, we 
compared our data to those obtained after development of fertilized eggs 
from the 1-cell to the 6-8-cell stage in media containing the RNA poly- 
merase II inhibitor alpha amanitin”. Using hierarchical clustering of the 
global gene expression patterns, we found that after genome exchange, 
transcript types and abundances at the 6-12-cell stage most closely 
resembled a state of inhibition of transcriptional activity (Fig. 1i). 
We then identified transcripts that were relatively upregulated in 
comparison to unfertilized oocytes. Using day 3 and day 4 IVF con- 
trols, we defined 761 transcripts upregulated at zygotic genome 
activation (more than fivefold, P<0.01). Of these 761 transcripts, 
only 124 (16%) were upregulated after genome exchange, and 62 
(8%) were upregulated after amanitin treatment (more than fivefold, 
P<0.01), possibly reflecting differential mRNA stability (Fig. 1j). In 
contrast, in parthenotes 536/761 transcripts (70%) (P< 0.001) were 
more than fivefold upregulated. 
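
The overlap percentages quoted above follow directly from the reported counts. A minimal sketch that reproduces them is given here; the dictionary keys and the function name are labels chosen for illustration and do not come from the study.

```python
# Transcripts upregulated at zygotic genome activation in IVF controls.
ZGA_TRANSCRIPTS = 761

# Number of those transcripts also upregulated in each sample type,
# as reported in the text.
overlap_counts = {
    "genome exchange": 124,
    "amanitin-treated": 62,
    "parthenote": 536,
}


def overlap_percentages(counts, total):
    """Return each overlap as a whole-number percentage of the total."""
    return {label: round(100 * n / total) for label, n in counts.items()}


percentages = overlap_percentages(overlap_counts, ZGA_TRANSCRIPTS)
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

Running the block gives 16%, 8% and 70% for genome exchange, amanitin treatment and parthenotes, matching the figures in the text.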

To determine if the developmental arrest correlated with continued 
expression of somatic cell genes, we identified 1,406 genes that were 
expressed at higher levels in the skin donor fibroblasts than in MII 
oocytes (more than tenfold, P< 0.001 for four biological replicates). 
The average transcript level of these genes after genome transfer was 
0.4-fold lower than in IVF controls. Therefore, neither the transcrip- 
tional program of a somatic cell nor that of a blastomere was being 
expressed. Among the few genes that were specifically elevated after 
genome exchange (Supplementary Table 2), one, GADD45G, is 
involved in stress-induced cell cycle arrest. 


The oocyte genome rescues development 

These developmental defects could be caused by an inability of the 
somatic cell genome to be appropriately expressed, replicated and/or 
segregated during cleavage development. Alternatively, molecules 
specific to the oocyte genome for which the somatic nucleus is unable 
to compensate may be removed during oocyte genome removal. 

To distinguish between these possibilities, we transferred a somatic 
cell genome but did not remove the oocyte genome (Fig. 2a). In contrast 
to previous experiments, development continued to the compacted 
morula stage, and expression of the CAGGS:GFP transgene was re- 
initiated at the appropriate stage (35/35 cleavage stages with four or 
more cells). Development to the blastocyst stage was efficient (13 
blastocysts of 63 transferred oocytes, or 21%), indicating that the somatic 
cell genome did not interfere with development to the blastocyst stage. 




Derivation of pluripotent stem cells 

From these blastocysts, we isolated the inner cell mass and derived two 
cell lines, soPS1 (for somatic cell genome, oocyte genome pluripotent 
stem cell 1), containing the genome of a male T1D subject, and soPS2, 
containing the genome of a healthy male adult. Both cell lines were 
triploid (Fig. 3a), containing short tandem repeat (STR) alleles con- 
sistent with the presence of a diploid somatic cell genome and the 
haploid genome of the oocyte (Supplementary Tables 3 and 4). 
soPS1 contained an additional chromosome 17 of somatic donor cell 
origin and a balanced translocation between chromosomes 15 and 17 
(Supplementary Fig. 5a, b). At passage 23, 30% of soPS1 cells had 
gained additional copies of chromosomes 12 and 17, chromosomal 
aberrations that commonly occur in pluripotent stem cell cultures 
because they confer a growth advantage”. soPS2 was karyotypically 
stable over more than 20 passages (Supplementary Material). During a 
period of 6 months, soPS1 and soPS2 completed more than 30 passages 
or over 100 population doublings (Supplementary Fig. 5c) without 
undergoing replicative crisis. Mitochondrial genomes were of oocyte 
donor origin without sign of heteroplasmy in either cell line (Fig. 3b and 
Supplementary Fig. 5d). Mitochondria transferred with the somatic 
nucleus may be outnumbered by the mitochondria of the oocyte, or 
they may be lost during cleavage development. 

Both soPS cell lines expressed molecular markers characteristic 
of pluripotent stem cells (Fig. 3c), and when differentiated in vitro, 
or following injection into immunocompromised mice, cell types 
representative of all three germ layers were observed (Fig. 3d). The 
global gene expression profile of both soPS cell lines clustered closely 
with that of other pluripotent cell types, including NYSCF1, a stem 
cell line derived from an IVF blastocyst. The parthenogenetic stem cell 
line, pPS1, and iPS cell lines derived from both skin cell donors also 
clustered closely with soPS cells, but the somatic donor cells clustered 
separately (Fig. 3e). We identified 1,327 genes that were differentially 
expressed between soPS2 and its donor fibroblast (P < 0.01). Of these 


[Figure 2 graphic. a, Schematic: artificial activation without enucleation, followed by transfer only, parthenote (somatic genome removal) or genome exchange (oocyte genome removal). c, Bar chart with ZGA marked; legend: Transfer only (n = 63), Parthenote (n = 7), Genome exchange (n = 32).] 

Figure 2 | Development after somatic cell genome transfer with retention of 
the oocyte genome. a, Schematic of somatic genome transfer without or with 
removal of either the oocyte or the somatic cell genome at the first interphase. 
b, Developmental progression ‘transfer only’. Days post artificial activation are 
indicated. c, Developmental potential. Vertical axis is the percentage of 
activated eggs reaching specific developmental stages (horizontal axis). 


72 | NATURE | VOL 478 | 6 OCTOBER 2011 


©2011 Macmillan Publishers Limited. All rights reserved 


[Figure 3 graphic. a, Karyotype soPS2. b, Labels: soPS2 (passage 8), somatic donor cell, oocyte donor (skin cells). d, Differentiation into three germ layers (including pigmented epithelium). e, Cluster labels: Skin cell donor 1, Skin cell donor 2, Oocyte donor skin cell, NYSCF1, soPS2, soPS1, Parthenogenetic pPS1, Skin cell donor 2 iPS, Skin cell donor 1 iPS.] 

Figure 3 | soPS cells are pluripotent. a, Karyotype. b, Sequence of the 
mitochondrial hypervariable region I. c, Immunostaining for pluripotency 
markers. d, Immunostaining and histochemistry of cells/tissues upon 
differentiation. e, Cluster diagram of global gene expression analysis. Cell lines 


1,327 transcripts, 463 were present at fivefold higher levels in the stem 
cells than in the fibroblasts, and 670 transcripts were decreased by a 
factor of five or more in soPS2. Among the genes with the most significant 
upregulation in soPS1 and soPS2 were genes typically expressed in 
pluripotent stem cells, but not in fibroblasts, such as LIN28A, POU5F1, 
SOX2, NANOG and LEFTY1 (Fig. 3f). Genes that were most downregulated 
included those typically expressed in fibroblasts, such as fibroblast 
activating protein (FAP), pappalysin (PAPPA), metallopeptidase 
(MMP3), a collagen triple helix-containing protein (CTHRC1), and a 
mesoderm-specific transcription factor (SNAI2) (Fig. 3f). In comparison 
to NYSCF1, 28 genes and 24 genes were expressed at higher levels 
in soPS2 and soPS1, respectively (more than threefold, P < 0.01) 
(Supplementary Table 5), including neuropeptide galanin, neuronatin, SRY, 
rex1 (also known as ZFP42), NODAL, Cerberus 1 and LEFTY2. 
Presumably, expression of these genes reflects spontaneous differentiation 
into various cellular lineages in soPS cultures, rather than incomplete 
reprogramming of the somatic cell genome. In additional 
comparisons we were not able to identify consistent differences between 
soPS cells, iPS cells, NYSCF1 and pPS1 (Supplementary Fig. 6). 


Epigenetic reprogramming 

Consistent with reprogramming of the somatic cell genome to a 
pluripotent state, methylation of DNA at the Nanog promoter was 
low (5-15%) in soPS cells and high (38-58%) in the somatic donor cells 
(Fig. 4a). Demethylation at the Nanog and Oct4 promoters correlated 
with the expression of single nucleotide polymorphisms (SNPs) 
located in the somatic genome (Fig. 4b). Demethylation was specific 
and did not occur at the imprinted locus PEG3: two thirds of sequencing 
reads were methylated in soPS1, reflecting the presence of two 
methylated maternal alleles, one from the oocyte and one from the 
somatic cell, as well as a single paternal allele of somatic origin (Fig. 4c). 

To determine whether reprogramming had occurred at other loci, we 
used a genome-wide digital allelotyping approach to distinguish gene 
expression from the somatic cell-derived genome and the oocyte-derived 
genome in soPS cells. This method is based on a library of 27,000 ‘pad- 
lock’ probes flanking known SNPs on all 23 chromosomes of the human 
genome (Fig. 4d)°*°. Extension of the padlock probes by DNA polymerase 
captures the SNP and allows single molecule DNA sequencing. SNP 
capture on genomic DNA will reflect the allelic ratio of the SNP, whereas 
SNP capture on cDNA will reflect the transcriptional activity of an allele. 

We prepared genomic DNA and cDNA from soPS cells and their 
corresponding somatic cells (Supplementary Fig. 7). By generating 
108,982,981 sequencing reads (Supplementary Table 6), we were able 
to identify 787 and 483 expressed SNPs for soPS1 and soPS2, respec- 
tively, for which the oocyte donor DNA sequence differed from that of 
the somatic cell and for which the somatic cell was homozygous. The 
median allelic ratio (somatic/(oocyte plus somatic)) for the genomic 
DNA was 0.64 for both soPS1 and soPS2. This ratio is consistent with 
the inference that a diploid complement of 46,XY chromosomes ori- 
ginating from the somatic cell and a haploid set of 23,X chromosomes 
originating from the oocyte are present in the soPS cell lines. 
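
The expected genomic ratio behind this inference is simple copy-number arithmetic. The sketch below is illustrative only (the function name is hypothetical): it computes the fraction of alleles expected to come from the somatic genome when two somatic copies sit alongside one oocyte copy, which can then be compared with the observed median of 0.64.

```python
from fractions import Fraction


def expected_somatic_fraction(somatic_copies, oocyte_copies):
    """Fraction of alleles expected to come from the somatic genome."""
    return Fraction(somatic_copies, somatic_copies + oocyte_copies)


# Diploid somatic genome plus haploid oocyte genome (triploid soPS cells).
expected = expected_somatic_fraction(2, 1)
print(expected, round(float(expected), 4))  # 2/3 0.6667
```

The observed median of 0.64 lies close to this expected value of 2/3.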

To calculate the proportion of transcripts expressed from the so- 
matic cell-derived genome in soPS cells (cDNA somatic/(cDNA oocyte 
plus cDNA somatic)), each individual SNP was normalized to the ratio 
of the same SNP observed in genomic DNA. If a locus was expressed in 
proportion to its genomic content in soPS cells, the ratio of transcripts 
would be expected to equal 2/3 = 0.6667. Allele ratios for each SNP 
were binned in increments of 0.05 units, and that number of transcripts 
was expressed as a fraction of total SNPs captured (Fig. 4e). (For 
example, 162 somatic cell SNPs were expressed at a ratio of 0.65-0.7 
in soPS1. As the total number of SNPs captured was 787, the fraction is 
162/787 = 0.206, yielding the data point indicated by an asterisk in 
Fig. 4e). The median of the allelic ratio was 0.67 for soPS1 and 0.64 for 
soPS2, consistent with expression from the somatic genome propor- 
tional to the genomic content. The distribution of allelic ratios approxi- 
mated a Gaussian curve (Shapiro-Wilk test for normality W = 0.92 for 
soPS1 and W = 0.97 for soPS2), indicating significant variability in the 
contribution of a particular allele. Such variation was not specific to 
soPS cells. Variability at comparable levels was also observed for allelic 
ratios in diploid HUES6 as well as in tetraploid HUES6-somatic cell 
hybrids** (Supplementary Fig. 8), and may be caused by polymorph- 
isms in gene regulatory regions”®. 
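
The binning step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name, the handling of a ratio of exactly 1.0 and the example ratios are all hypothetical, and the normalization to genomic DNA is not reproduced because its exact form is not given in the text.

```python
def bin_allelic_ratios(ratios, width=0.05):
    """Count ratios in [0, 1] per bin and return fractions of the total.

    Bin i covers [i * width, (i + 1) * width); a ratio of exactly 1.0
    is placed in the last bin.
    """
    n_bins = round(1 / width)
    counts = [0] * n_bins
    for ratio in ratios:
        # Rounding guards against floating-point error at bin edges.
        index = min(int(round(ratio / width, 9)), n_bins - 1)
        counts[index] += 1
    return [count / len(ratios) for count in counts]


# Hypothetical ratios, for illustration only.
example = [0.66, 0.68, 0.67, 0.52, 0.71]
bin_fractions = bin_allelic_ratios(example)
print(bin_fractions[13])  # share of SNPs in the 0.65-0.70 bin

# The worked example from the text: 162 of 787 SNPs in one bin.
print(round(162 / 787, 3))  # 0.206
```

Expressing each bin as a fraction of all captured SNPs, rather than as a raw count, is what allows soPS1 (787 SNPs) and soPS2 (483 SNPs) to be compared on the same axis.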

Our expectation was that if reprogramming in soPS cells were 
incomplete we would detect a bias in this distribution: genes expressed 
at high levels in fibroblasts, but at low levels in pluripotent stem cells, 
would be expressed predominantly from the fibroblast cell-derived 
genome in soPS cells; conversely, genes that are expressed at high 
levels in pluripotent stem cells, but not in fibroblasts, would be 
expressed predominantly from the oocyte-derived genome. Among 
a total of 1,019 genes represented in both the gene expression array and 
the SNP capture data, 38 (18 for soPS1 and 20 for soPS2) were 
upregulated at least fourfold in soPS cells (P< 0.01), and 69 (45 for 
soPS1 and 24 for soPS2) genes were expressed in the donor fibroblasts 
at levels fivefold or higher compared to soPS cells (P< 0.01). The 
mean allelic transcript ratios for the ‘somatic cell genes’ and the ‘pluripotent 
cell genes’ were identical, and differed neither from the expected 
allelic ratio of 0.66 nor from the allelic ratio of all captured genes 
(Fig. 4f and Supplementary Table 7). 


Discussion 


Upon replacement of the oocyte genome with that of a somatic cell, we 
observed developmental arrest at late cleavage stages in association 
with severe transcriptional abnormalities, similar to the arrest we 
had previously observed following somatic cell genome exchange in 
human zygotes”. These defects occurred despite the use of high quality 
oocytes obtained from women without history of infertility. This result 
is consistent with previous studies*””°, but contrasts with a report of 
efficient development to the blastocyst stage’. Those authors attributed 
the improved developmental potential to oocyte quality. However, 
because the STR genotype of the blastocysts was incomplete, an 
alternative interpretation is that the presence of genetic material of 
the oocyte promoted development to the blastocyst stage. Another 
group has generated a single blastocyst after transfer of pluripotent 
stem cell genomes’, suggesting that the developmental arrest seen with 
somatic cells may not apply to pluripotent cells. Consistent with this
hypothesis, we find that after transfer of blastomere or oocyte nuclei, 
development to the blastocyst stage occurs. 

In contrast, if the somatic cell genome is merely added and the 
oocyte genome is not removed, development to the blastocyst stage 
occurs. From these blastocysts, we were able to derive triploid plur- 
ipotent stem cells containing a diploid genome complement of the 
somatic cell and a haploid genome complement of the oocyte. 

It was previously shown that epigenetic memory is often retained in 
mouse iPS cells, but not in cells reprogrammed by mouse oocytes’. We 
compared gene expression from the haploid oocyte genome, which 
reached pluripotency through its developmental trajectory, with gene 
expression from the diploid somatic genome, which required a re- 
programming process to establish pluripotency. Because both of these 
genomes are present in the same cell, they are exposed to an identical 
environment. Preferential expression of pluripotency genes from the 
oocyte genome, or preferential expression of somatic genes from the 
somatic cell genome, would be a strong indication of epigenetic 
memory. Using a genome-wide allelotyping approach and gene expres- 
sion profiling, we compared the levels of expression from each allele, and 
did not find evidence for epigenetic memory: expression from the 
reprogrammed somatic genome was proportional to the genomic con- 
tent and did not depend on the activity in the fibroblast donor cell. 

This report demonstrates the feasibility of somatic cell reprogram- 
ming using human oocytes. With a reliable source of human oocytes, it 
should be possible to overcome the requirement of the oocyte genome 
for somatic cell reprogramming, allowing the generation of diploid 
pluripotent stem cells. 


METHODS SUMMARY 


Human oocytes were aspirated approximately 36 h after human chorionic
gonadotropin application and transported to the laboratory in a portable 
incubator at 37 °C. The oocyte genome of the MII oocyte was identified by micro- 
tubule birefringence and/or staining in Hoechst 33342 and minimal ultraviolet 
illumination, whereas in activated oocytes, the oocyte genome was identified by 
Hoffmann modulation contrast optics only and then removed by laser-assisted 
micromanipulation. Somatic cells were introduced into the oocyte. Oocytes were 
activated in the calcium ionophore ionomycin, followed by incubation in the kinase 
inhibitor 6-dimethylaminopurine, thoroughly washed, and allowed to develop. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Oocyte donation. Oocyte donors of age 22-33 were recruited from the women 
participating in the reproductive oocyte donation program at the Center for 
Women’s Reproductive Care (CWRC) at Columbia University P&S. These 
women had made a decision to enter the reproductive egg donation program, 
they met all criteria required for donation for reproductive purposes, and only 
then were presented with the option to donate oocytes for research. Both licensed 
medical social workers and CWRC physicians screened all women with respect to 
their reproductive, medical and psychosocial health. All of the women had a 
college degree or additional higher education, and none were financially dis- 
advantaged. All women in the study were fully employed. During a period of 
19 months, 16 women out of the 252 women enrolled in the reproductive oocyte 
donation program were asked if they wanted to donate oocytes to research. These 
women discussed the stem cell study in detail with a physician and those who 
chose to donate oocytes to research gave signed informed consent and initiated a 
standard hormone control regimen. All 16 women decided to participate in the 
study and gave informed consent (100% compliance). Two women did not com- 
plete the hormone treatment because of a lack of response. Two additional 
women donated for the study at a later time. In total, 16 women donated 270 
mature MII oocytes (range of 2-26, or a mean of 16.9 oocytes per donor cycle). 
Payment for participation was equal to payment for women donating oocytes for 
reproduction at CWRC, or $8,000 (pre-tax). 

Skin biopsies. Skin biopsies (3 mm) were obtained using an AcuPunch biopsy Kit 
(Acuderm Inc.) from the locally anesthetized (1% Lidocaine HCl, Hospira, Inc.) 
upper arm or the upper leg. Biopsies were cut in 10-15 smaller pieces, placed in a 
six-well dish around a droplet of silicon grease, covered with a glass cover slip, and 
allowed to grow for 3-4 weeks in medium containing DMEM, 10% FBS, 1% Anti-Anti,
nucleosides, GlutaMAX, β-mercaptoethanol and nonessential amino acids
(all Invitrogen). In some instances, skin biopsies were obtained from subjects also 
donating oocytes. The identification numbers of the skin cell donors are 1-000 
(male, T1D used for generation of soPS1) and 1-016 (male, used for generation of 
soPS2), and 1-034 for an oocyte donor. Protocols for obtaining skin biopsies and 
for their use in reprogramming experiments were reviewed and approved by the 
institutional review board and stem cell committees of Columbia University. All 
subjects gave signed informed consent. 

Genome transfer into human oocytes and stem cell derivation. Oocytes were 
transported in GMOPSplus (Vitrolife) in a portable incubator (INC-RB1, 
CryoLogic) at 37 °C. The oocyte genome was identified by microtubule birefringence
using the Oosight imaging system, and/or staining in 2 µg ml⁻¹ Hoechst
33342 and minimal ultraviolet illumination. All manipulations were done on a
Nikon TE2000-U equipped with Narishige micromanipulators and a Tokai hit 
heating plate. Somatic cells were infected with a vesicular stomatitis virus G 
protein (VSVG)-pseudotyped CAGGS:GFP or CAGGS:H2B-GFP retrovirus, 
sorted for GFP expression with a BD FACSAria IIu, and grown to confluence to
induce cell cycle exit. A single somatic cell was inserted below the zona pellucida 
of the oocyte using laser-assisted zona drilling (Hamilton Thorne) and introduced
into the oocyte either by two fusion pulses of 20 µs width and 1.3 kV cm⁻¹
strength (LF201, NEPA Gene), in cell fusion medium (0.26 M mannitol, 0.1 mM
MgSO4, 0.05% BSA, 0.5 mM HEPES), or by prior incubation of the somatic cell in
inactivated Sendai virus HVJ-E (GenomeOne, Cosmo Bio), diluted with fusion 
buffer 1:5. For both fusion methods, the efficiency of fusion and oocyte survival 
was close to 100%. The first polar body was removed or ablated with two to three 
500-µs laser pulses to avoid potential fusion to the oocyte. Oocytes were activated
in 5 µM ionomycin (Sigma) in GMOPSplus for 5 min, followed by 4-5 h incubation
in 2 mM 6-DMAP (Sigma) or until small interphase nuclei became apparent,
thoroughly washed in Global medium and cultured to the blastocyst stage at 37 °C 
(Minc incubator, Cook), in a certified gas mixture of 5% O2, 6% CO2 and 89% N2
(TechAir). Some samples were harvested on day 3-5 of development for gene 
expression analysis. Blastocysts were used for derivation of pluripotent stem cells 
as described’’, with the addition of 2 µM Thiazovivin (Stemgent) and 10 µM
Rock-Inhibitor Y-27632 (Stemgent) to the derivation medium. Human pluripotent
stem cells were expanded manually or enzymatically and cultured under
standard conditions, as previously described**. Human blastocysts for the deriva- 
tion of NYSCF1 and human cleavage stages for gene expression analysis were 
thawed using the Sidney IVF thawing kit (K-SITS-5000, Cook Medical). Human 
blastocysts and cleavage stages were obtained from anonymous donors at CWRC 
under protocols reviewed and approved by the Columbia stem cell committee 
and the Columbia IRB. NYSCF1 characterization is described in Supplementary 
Fig. 9. iPS cells were generated according to published protocols using VSVG- 
pseudotyped retroviruses*’. 

Gene expression analysis. RNA from human cleavage stages and blastocysts was 
isolated using a PicoPure RNA isolation kit (Arcturus) and amplified by two rounds
of T7 transcription using the TotalPrep RNA amplification kit (Illumina). RNA
from cell lines was isolated using the RNeasy Plus mini kit (Qiagen) and amplified
with a single round of T7 transcription. Amplified biotin labelled RNA was 
hybridized to the Illumina HumanRef-8 v3 Expression BeadChips. Analysis 
was undertaken using GenomeStudio and Microsoft Excel programs as follows: 
data were normalized to the average signal. Background was subtracted. 
Unfertilized oocytes (two biological replicates consisting of five MII oocytes) were 
used as a reference point for all comparisons. Transcripts (1,345) were more than 
fivefold upregulated (P < 0.01) in IVF controls on day 3 (two biological replicates 
consisting of two specimens). Of these transcripts 761 were also elevated in IVF 
controls collected early on day 4 (two biological replicates consisting of 17
specimens, more than fivefold, P < 0.01). These were defined as the ZGA transcripts.
We then determined how many of these ZGA transcripts were also
upregulated after genome exchange (three biological replicates, nine specimens 
of up to 12 blastomeres), after amanitin treatment (two biological replicates 
consisting of two specimens) (more than fivefold, P<0.01), and parthenotes 
(one sample consisting of four specimens). Data analysis for downregulated 
transcripts was done accordingly: 829 genes with transcript levels of 20% or less 
of those found in the oocyte were identified (P < 0.01). Among those 829 tran- 
scripts, we determined the number of transcripts that were also downregulated 
after genome exchange and after amanitin treatment (P < 0.01). Gene expression 
analysis of soPS cells was done with normalization to average and subtraction of 
background. 
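
The definition of the ZGA transcript set used above amounts to a fold-change filter followed by a set intersection. The following Python sketch illustrates that logic under stated assumptions: the function names, the dictionary-based data layout and the toy values are hypothetical, and the per-transcript P values are assumed to be supplied by the array software rather than computed here.

```python
def upregulated(sample, reference, pvalues, fold=5.0, alpha=0.01):
    """Transcript IDs with signal more than `fold` times the reference and P < alpha."""
    return {
        t
        for t, signal in sample.items()
        # Skip non-positive reference signals, which can occur after
        # background subtraction (handling chosen for this sketch).
        if reference.get(t, 0) > 0
        and signal / reference[t] > fold
        and pvalues.get(t, 1.0) < alpha
    }


def zga_transcripts(day3, day4, oocyte, p_day3, p_day4):
    """Transcripts upregulated versus oocytes on both day 3 and day 4."""
    return upregulated(day3, oocyte, p_day3) & upregulated(day4, oocyte, p_day4)


def fraction_activated(zga, test_sample, oocyte, p_test):
    """Share of ZGA transcripts that are also upregulated in a test sample."""
    if not zga:
        return 0.0
    return len(zga & upregulated(test_sample, oocyte, p_test)) / len(zga)


# Toy signals (hypothetical, not study data).
oocyte = {"A": 10, "B": 10, "C": 10}
day3 = {"A": 100, "B": 60, "C": 20}
day4 = {"A": 80, "B": 30, "C": 100}
p = {"A": 0.001, "B": 0.001, "C": 0.001}
zga = zga_transcripts(day3, day4, oocyte, p, p)
```

The same `upregulated` helper can be reused with the reciprocal threshold (signal at 20% or less of the oocyte level) for the downregulated set, if the comparison is inverted.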

All array data and additional details on samples and analysis are available on 
GEO under accession number GSE28024. 

Cell line analysis. For karyotype and STR analysis, live cultures were shipped to
Cell Line Genetics (WI). RNA and DNA were isolated from cultures at passage 7
to 11, using QIAamp DNA Mini Kit for SNP capture. Padlock probes (27,000)
were prepared to capture expressed SNPs from genomic DNA and cDNAs. The 
captured SNPs were quantified using single-molecule DNA sequencing according
to ref. 30. Immunohistochemistry was done using primary antibodies recognizing
Tra-1-60 (MAB4360, Millipore), Tra-1-81 (MAB5381, Millipore), SSEA-4
(MAB1435, R&D), SSEA-3 (MAB1434, R&D), Nanog (AF1997, R&D), Oct-4
(09-0023, Stemgent), Sox2 (09-0024, Stemgent), MF20 (DSHB), AFP (Dako), 
beta III tubulin (Sigma T2200) at dilutions of 1:500 to 1:1,000. Secondary anti- 
bodies were conjugated with Alexa Fluor (Invitrogen). Alkaline phosphatase 
staining was done using an alkaline phosphatase substrate kit (Vector 
Laboratories). Teratomas were generated by subcutaneous injection into NSG 
mice (Jackson Laboratory) and harvested after 10-15 weeks. Animal experimentation
was approved by the Columbia IACUC. For Affymetrix SNP chip
analysis, gDNA was processed according to the GeneChip mapping 500K assay
manual, and hybridized to a 250K Nsp Array according to the manufacturer’s
instructions. SNP data analysis was done using Affymetrix Genotyping Console.
Bisulphite conversion of DNA was done using the EpiTect Bisulfite Kit according 
to the manufacturer’s instructions. For bisulphite sequencing, PCR products were 
cloned into Topo TA vector (Invitrogen) and sequenced using M13R primer. 

Primers used in this study were as follows (sequences 5’ to 3’):
ATTTGTTTTTTGGGTAGTTAAAGGT and CCTAAACTCCCCTTCAAAATCTATT,
bisulphite sequencing of Oct4 (ref. 36);
TGG TTA GGT TGG TTT TAA ATT TTT G and AAC CCA CCC TTA TAA ATT CTC AAT TA,
bisulphite sequencing of Nanog’;
GGAAAGAAAATTTTTATAGGTAGGATAGT and AAACCCTAAACCTCCTAAACTAAATCTAA,
bisulphite sequencing of PEG3 (ref. 38);
GGGGTCAGTGCCTCAATAAG and TTTGGTCTCCAGGTTTCAGG,
sequencing of rs7221396;
CAGTTTTACCCCCTTACCTTCA and ACACATGTTGCCACCAGAGA,
sequencing of rs11652263;
GGGTTAGAAGCTCCTGCAAA and CTCTGGTCTGTCACCCATCA,
sequencing of rs2286336;
CAGGTTGACAGACATGAAATCC and CTIGCTTTTTCCTGCCATTGT,
sequencing of rs10521202;
CACCATTAGCACCCAAAGCT and TGATTTCACGGAGGATGGTG,
sequencing of mitochondrial hypervariable region I.

Images and settings. GFP fluorescence and bright field images were acquired with
a Nikon TE2000-U microscope equipped with a Nikon Digital Sight DS-Qi1Mc
camera and NIS elements AR imaging program. GFP fluorescence was acquired
with 1-s exposure time, no gain for all images. Bright field images were contrast- 
adjusted (equally across the entire image) in Adobe Photoshop. Histology ana- 
lysis was done using an Olympus IX-71 microscope equipped with a U-TV0.5XC- 
3 colour camera. Immunofluorescence was done using Olympus DP30BW cam- 
era and Olympus imaging acquisition software. All fluorescent images are pseudo 
colours. Figures were assembled in Adobe Freehand MX. 
Complement factor H binds
malondialdehyde epitopes and protects 
from oxidative stress 


David Weismann, Karsten Hartvigsen, Nadine Lauer, Keiryn L. Bennett, Hendrik P. N. Scholl, Peter Charbel Issa,
Marisol Cano, Hubert Brandstätter, Sotirios Tsimikas, Christine Skerka, Giulio Superti-Furga, James T. Handa,
Peter F. Zipfel, Joseph L. Witztum & Christoph J. Binder


Oxidative stress and enhanced lipid peroxidation are linked to many chronic inflammatory diseases, including 
age-related macular degeneration (AMD). AMD is the leading cause of blindness in Western societies, but its 
aetiology remains largely unknown. Malondialdehyde (MDA) is a common lipid peroxidation product that 
accumulates in many pathophysiological processes, including AMD. Here we identify complement factor H (CFH) as a 
major MDA-binding protein that can block both the uptake of MDA-modified proteins by macrophages and 
MDA-induced proinflammatory effects in vivo in mice. The CFH polymorphism H402, which is strongly associated 
with AMD, markedly reduces the ability of CFH to bind MDA, indicating a causal link to disease aetiology. Our 
findings provide important mechanistic insights into innate immune responses to oxidative stress, which may be 
exploited in the prevention of and therapy for AMD and other chronic inflammatory diseases. 


Increased oxidative stress has been implicated in the pathogenesis of 
many different diseases’. As a consequence of oxidative stress, proteins, 
lipids and DNA can be damaged, often resulting in structural changes. 
For example, when membrane phospholipids undergo lipid peroxida- 
tion, MDA and other reactive decomposition products are generated’. 
These can in turn modify endogenous molecules, generating novel 
oxidation-specific epitopes (OSEs), which are also present on the sur- 
face of apoptotic cells and blebs released from them’. Many of these 
OSEs are recognized as danger signals by innate immune receptors’. 
Elucidating the molecular mechanisms by which oxidative damage 
challenges the immune system would pave the road for new diagnostic 
and therapeutic approaches in several pathologies. 

MDA and its condensation products are reliable markers for oxid- 
ative stress and have been associated with many disorders, including 
atherosclerosis’* and AMD, a degenerative disease affecting the retina 
that leads to irreversible vision loss*®. AMD is the most common cause 
of blindness in the elderly in Western societies’. A hallmark of 
developing AMD is the accumulation of extracellular deposits, termed 
drusen, which have been shown to contain MDA*. MDA-modified 
proteins are known to induce inflammatory responses and are recog- 
nized by innate immunity”*’. We recently demonstrated that OSEs in 
general are a major target of innate natural antibodies both in mice and 
humans and that ~15% of all immunoglobulin M (IgM) natural anti- 
bodies bound MDA-type adducts, suggesting a great need to defend 
against this specific modification'*. However, the abundance of MDA 
and the danger associated with it suggests that additional, evolutionary 
conserved innate defence mechanisms exist. 


CFH binds MDA modifications 


We used an unbiased proteomic approach to identify plasma proteins 
binding to MDA modifications. Because normal plasma contains high
titres of MDA-specific natural antibodies'*, we purified MDA-binding
proteins from plasma of atherosclerotic Rag−/−Ldlr−/− mice
that lack immunoglobulins. Pooled plasma was incubated with beads 
coupled to either malondialdehyde-acetaldehyde (MAA)-modified or 
unmodified polylysine, respectively. MAA is an advanced MDA- 
lysine adduct the structure of which is shown in Supplementary Fig. 1 
(see also ref. 13). Bound proteins were eluted and identified by mass 
spectrometry. As many as 45 unique peptides were found exclusively in 
MAA-polylysine pull-downs, of which >55% could be attributed to 
CFH (Supplementary Fig. 2 and Supplementary Table 1). CFH is a 
major regulator of the complement system and protects host tissues 
from complement-mediated damage™*. Immunoblot analysis revealed 
the presence of CFH on MAA-coated beads but not on control beads 
(Fig. 1a). This finding was confirmed using human plasma (Fig. 1b).
Interestingly, the anti-CFH antibody also detected lower molecular 
weight bands, which may represent CFH-related proteins (CFHRs) that 
share high sequence homology with CFH. 

Using enzyme-linked immunosorbent assay (ELISA), we demon- 
strated that CFH bound to MDA directly and independently of the 
protein carrying the adducts. Purified CFH bound in a calcium- 
independent manner to both MAA-modified low density lipoprotein 
(MAA-LDL) and MAA-modified bovine serum albumin (MAA-BSA), 
but not to unmodified proteins (Fig. 1c and Supplementary Fig. 3). 
Moreover, we tested binding of CFH to the oxidation-specific modifi- 
cations phosphocholine-BSA (PC-BSA), which is bound by C reactive 
protein (CRP), as well as carboxyethylpyrrole-BSA (CEP-BSA) and 
4-hydroxynonenal-BSA (4-HNE-BSA). None of these modifications 
were bound by CFH (Fig. 1d and Supplementary Fig. 4A, B). CRP and 
C3 were also detected in the MAA-polylysine pull-downs (Supplemen- 
tary Table 1), but neither of them bound to coated MAA-BSA (Fig. 1d 
and Supplementary Fig. 4C). 


1Center for Molecular Medicine (CeMM) of the Austrian Academy of Sciences, 1090 Vienna, Austria. 2Department of Laboratory Medicine, Medical University of Vienna, 1090 Vienna, Austria. 3Department of
Medicine, University of California at San Diego, La Jolla, California 92093, USA. 4Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute and Friedrich Schiller University,
07745 Jena, Germany. 5Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland 21287, USA. 6Nuffield Laboratory of Ophthalmology, University of Oxford, Oxford OX3 9DU,
UK. 7Octapharma PPGmbH, Research & Development, 1100 Vienna, Austria.


76 | NATURE | VOL 478 | 6 OCTOBER 2011 


©2011 Macmillan Publishers Limited. All rights reserved 


[Figure 1, panels a-h. a, b, immunoblots; c, d, ELISA bar charts; e-h, competition curves plotting B/B0 against competitor or antibody concentration (µg ml⁻¹) for LDL, MDA-LDL, MAA-LDL, CuOx-LDL, BSA, MAA-BSA and the antibodies MB24, MB47 and MDA2.]


Figure 1 | CFH specifically binds to MDA modifications. a, b, Immunoblot
for CFH (molecular weight: 150 kDa) using eluates from either polylysine (PL)
or MAA-polylysine (MAA-PL) beads incubated with Ldlr−/−Rag−/− mouse
plasma (a) or human plasma (b). c, ELISA for binding of purified CFH
(5 µg ml⁻¹) to coated native LDL, MAA-LDL, BSA and MAA-BSA. Values are
mean ± s.d. relative light units (RLU) per 100 ms of triplicate determinations.
d, ELISA for binding of purified CFH or CRP (5 µg ml⁻¹) to coated BSA, MAA-BSA
and PC-BSA. Values are mean ± s.d. RLU per 100 ms of triplicate
determinations. e-h, Competition immunoassays. e, f, Binding of purified CFH
(e) or binding of plasma CFH (f) to coated MAA-BSA in the presence of
increasing concentrations of LDL, MDA-LDL, MAA-LDL and Cu²⁺-oxidized
(CuOx)-LDL, or BSA and MAA-BSA. g, Binding of biotinylated MDA-LDL to
coated CFH in the presence of increasing concentrations of LDL, MDA-LDL,
MAA-LDL and CuOx-LDL. h, Binding of biotinylated MDA-LDL to coated
CFH in the presence of increasing concentrations of monoclonal antibodies
specific for ApoB100 (MB24 and MB47) or MDA (MDA2). Data are expressed
as a ratio of binding in the presence of competitor divided by the binding in the
absence of competitor (B/B0) and represent the mean ± s.d. of triplicate
determinations. As an estimate for the affinity, the dissociation constants Kd
were calculated as 6.4 X 10 “moll”! for the binding of CFH to coated MAA- 
BSA and 1.6 X 10 ° moll’ for the binding of MAA-BSA to coated CFH. 
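
The B/B0 normalization used in the competition immunoassays can be written out as a short Python sketch. The function name and the triplicate readings below are hypothetical; dividing each competitor reading by the mean of the no-competitor wells is one plausible way to obtain a mean and s.d. from triplicates, not necessarily the authors' exact procedure.

```python
from statistics import mean, stdev


def b_over_b0(with_competitor, without_competitor):
    """Return (mean, s.d.) of B/B0 for replicate readings.

    B  = signal in the presence of competitor
    B0 = mean signal in the absence of competitor
    """
    b0 = mean(without_competitor)
    ratios = [b / b0 for b in with_competitor]
    return mean(ratios), stdev(ratios)


# Hypothetical triplicate RLU readings (not study data).
ratio_mean, ratio_sd = b_over_b0([50, 60, 40], [100, 100, 100])
```

A value near 1 indicates no competition, and a value near 0 indicates that the competitor has displaced nearly all binding.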


To characterize the specificity of the binding of CFH to MDA, we 
performed competition assays. Only MDA- and MAA-modified LDL 
competed in a concentration-dependent manner for the binding of 
CFH to coated MAA-BSA. Neither native LDL nor the negatively
charged Cu²⁺-oxidized LDL showed inhibition, thereby excluding
non-specific interactions mediated by charge effects (Fig. 1e). To
demonstrate a dose-dependent interaction in plasma, the binding of
CFH to coated MAA-BSA was tested in different plasma dilutions
(Supplementary Fig. 5). Consistent with the notion that CFH is a 
major MDA-binding protein in plasma, binding of CFH to coated 
MAA-BSA was competed by soluble MAA-BSA with similar effi- 
ciency in whole plasma (Fig. 1f). In a reciprocal experiment, binding 
of biotinylated MDA-LDL to immobilized purified CFH was fully 
competed by either MDA- or MAA-modified LDL, even at very low 
competitor concentrations (Fig. 1g). In the same assay, the MDA- 
lysine-specific monoclonal antibody MDA2 fully inhibited binding 
of MDA-LDL to CFH. In contrast, the apoB-100-specific monoclonal 
antibodies MB47 and MB24, which bind apoB-100 of MDA-LDL 
(Supplementary Fig. 6), did not inhibit this interaction (Fig. 1h). 
Using surface plasmon resonance, we observed a concentration- 
dependent binding of CFH to coated MAA-BSA (Supplementary 
Fig. 7). Taken together, these findings prove that CFH binds speci- 
fically to MDA modifications. 


The CFH H402 variant has impaired MDA binding 


To map the binding site for MDA on CFH, we performed binding 
studies using recombinantly expressed CFH fragments (Fig. 2a). CFH 
is composed of 20 globular short consensus repeats (SCRs)'*. Only 
fragments containing either SCR7 or SCR20 bound to coated MDA 
(Fig. 2a). Reciprocally, soluble MDA-LDL only bound to immobilized 
fragments containing either SCR7 or SCR20, respectively (Sup- 
plementary Fig. 8). Importantly, these domains have also been iden- 
tified as clustering points of various disease-related mutations". 

Figure 2 | The SCR7 domain of CFH is critical for MDA binding. a, ELISA
for binding of CFH and recombinantly expressed CFH fragments to coated
BSA (white bars) or MAA-BSA (black bars). The length of CFH fragments is
indicated by schematic representations with each circle depicting one SCR.
Values are mean ± s.d. RLU per 100 ms of triplicate determinations. b, ELISA
for binding of purified CFH variant Y402 and CFH variant H402 (both at
1 µg ml⁻¹) to coated MAA-BSA. Values are mean ± s.d. RLU per 100 ms of
triplicate determinations. c, ELISA for binding of plasma CFH to coated MDA-LDL
in plasma of subjects homozygous for the H402 risk allele (CC, n = 38),
heterozygous for the H402 risk allele (CT, n = 88) or homozygous for the Y402
was calculated with P = 1.29 “° using an additive model. Symbols represent 
individual subject samples with horizontal bars indicating the mean of each 
group. Values are mean ± s.d. RLU per 100 ms of triplicate determinations
(***P < 0.001).



One of the most widely studied single nucleotide polymorphisms 
(SNPs) in CFH is the prevalent rs1061170 SNP, which causes an amino
acid switch at position 402 (Y→H) in SCR7. To determine the effect of
the H402 substitution, we purified CFH from plasma of homozygous 
individuals expressing either CFH Y402 or CFH H402, respectively, 
and tested the binding to MDA. Compared to the common Y402 
variant, the CFH variant H402 exhibited significantly impaired bind- 
ing to MAA-BSA (Fig. 2b). The H402 variant has been associated with 
a significant risk for the development of AMD’*"’. Therefore, we 
analysed the binding of CFH to coated MDA-LDL in plasma samples 
of AMD patients with the respective genotypes. Compared to the 
extent of CFH binding to MDA-LDL using plasma of individuals 
homozygous for the protective allele, binding in plasma of heterozyg- 
ous subjects was reduced by 23% (P< 0.001), and by 52% (P< 0.001) 
in plasma of subjects homozygous for the H402 risk allele (Fig. 2c), 
irrespective of the total plasma CFH levels (Supplementary Fig. 9A). 
Moreover, plasma levels of MDA-specific IgG and IgM antibodies were 
similar in all groups (Supplementary Fig. 9B, C). 
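
The genotype comparison above can be illustrated with a minimal Python sketch of an additive coding, in which each subject is scored by the number of H402 risk alleles (0, 1 or 2) and binding is regressed on that score. The function names and the binding values are hypothetical, chosen only so that the group means mirror the reported 23% and 52% reductions; the sketch computes an ordinary least-squares slope and does not reproduce the study's P value.

```python
from statistics import mean


def percent_reduction(reference_values, group_values):
    """Percentage drop of a group's mean relative to the reference group's mean."""
    ref = mean(reference_values)
    return 100.0 * (ref - mean(group_values)) / ref


def additive_slope(risk_allele_counts, binding):
    """Least-squares slope of binding on the number of risk alleles (0, 1 or 2)."""
    x_bar = mean(risk_allele_counts)
    y_bar = mean(binding)
    sxx = sum((x - x_bar) ** 2 for x in risk_allele_counts)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(risk_allele_counts, binding))
    return sxy / sxx


# Hypothetical binding values in arbitrary units (not study data).
tt = [100.0, 100.0]  # homozygous Y402, zero risk alleles
ct = [77.0, 77.0]    # heterozygous, one risk allele
cc = [48.0, 48.0]    # homozygous H402, two risk alleles

slope = additive_slope([0, 0, 1, 1, 2, 2], tt + ct + cc)
```

A negative slope corresponds to binding that falls with each additional risk allele, which is the pattern an additive model tests for.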

The genetic deletion of CFHR1 and CFHR3 has been reported to
protect from AMD and could influence CFH binding to MDA”. Less 
than 25% of individuals in this study carried deletions at these loci and 
their removal from our analysis did not alter the significance of the 
association of rs1061170 with MDA binding (Supplementary Fig. 9D). 
Taken together, the impaired ability of the risk variant to bind MDA 
suggested an important role for this interaction in AMD pathogenesis. 


CFH binds cellular debris via MDA epitopes 


Owing to constant light exposure, the retina provides an environment 
that facilitates lipid peroxidation’. We detected MDA epitopes by 
immunohistochemistry in the eyes of subjects with and without 
AMD. MDA epitopes were detectable throughout the choroid and 
Bruch’s membrane (Fig. 3a, d). In eyes without AMD, labelling for 
MDA was stronger in the outer than inner Bruch’s membrane 
(Fig. 3a). In eyes with AMD, MDA staining was seen diffusely 
throughout Bruch’s membrane (Fig. 3d). Staining for CFH followed 
a similar pattern (Fig. 3b, e). In addition, strong CFH labelling was 
seen in the retinal pigment epithelium (RPE) and choriocapillaris 
basement membranes. Moreover, the presence of C3d, a cleavage 
product of iC3b, indicated co-factor activity at the same sites 
(Fig. 3c, f). We further demonstrated by confocal microscopy the 
presence of MDA epitopes on the surface of in vitro-generated necrotic 
RPE cells, a major cell type affected in AMD. Moreover, CFH co- 
localized with MDA epitopes, suggesting that MDA mediates recog- 
nition of dying cells by CFH (Fig. 3j). To demonstrate this directly, we 
used flow cytometry to assess the binding of CFH to apoptotic blebs 
from Jurkat T cells in the presence of MAA-BSA as competitor. 
Consistent with the presence of MDA epitopes on only a subgroup 
of apoptotic blebs, we found that CFH bound between 5% and 45% of apoptotic
blebs (Supplementary Fig. 10). Importantly, MAA-BSA competed
for this binding by more than 60%, whereas unmodified BSA
did not (Fig. 3k). Thus, MDA adducts present in several retinal com- 
partments and on the surface of necrotic RPE cells represent in vivo 
ligands for CFH. In addition, we predicted that CFH might also bind to 
MDA adducts in other tissues and confirmed this in atherosclerotic 
lesions (Supplementary Fig. 11). 


CFH inactivates complement on MDA-bearing surfaces 

An important regulatory activity of CFH lies with its capacity to act as 
co-factor for serine protease factor I, thereby promoting the degrada- 
tion of C3b into inactive iC3b fragments. Deposition of iC3b on 
apoptotic cells increases their clearance in an anti-inflammatory manner’'**.
We therefore tested whether CFH induces iC3b generation
when bound to MDA. Indeed, CFH promoted the formation of iC3b 
in a dose- and time-dependent manner when bound to coated MAA- 
BSA (Supplementary Fig. 12). When comparing the co-factor activity 
of the 402 variants on MDA-decorated surfaces, we discovered 

Figure 3 | CFH binds to MDA epitopes present in AMD lesions, on necrotic
cells and apoptotic blebs. a-i, Immunohistochemistry of MDA (left), CFH
(middle) and C3d (right) localization in human maculas. a-c, Eye of a 72-year-old
subject heterozygous for the H402 SNP without AMD. d-f, Eye of a 93-year-old
subject homozygous for the H402 SNP with AMD. g-i, IgG control
immunostains. Red arrows indicate positive labelling of choriocapillaris
basement membrane and arrowheads indicate labelling of Bruch’s membrane
(BrM). Scale bar, 25 µm. Sections are representative for 7 donors (5 AMD, 2
controls). j, Confocal immunofluorescent photograph of necrotic RPE cells
stained with the MDA-specific IgM natural antibody EO14 (green) and CFH
(red), respectively. The right panel shows a merged picture indicating co-localization
of CFH binding with the presence of MDA epitopes (yellow).
k, Competition assay for the binding of CFH to apoptotic blebs from Jurkat T
cells either alone or in the presence of BSA or MAA-BSA assessed by flow
cytometry. Values are expressed as mean ± s.e.m. CFH binding (B/B0) based on
mean fluorescence intensities of four independent experiments (**P < 0.01).


a strong functional difference in that impaired MDA-binding of 
the risk variant resulted in severely reduced factor-I-mediated C3 
cleavage (Fig. 4a). This activity of CFH may represent an important 
protective mechanism in conditions in which MDA is continuously 
generated, for example, on the surface of dying cells. 

Importantly, other members of the CFH family such as CFHR1/CFHR3
show homology with the carboxy terminus of CFH and
therefore contain a potential MDA-binding site without possessing 
co-factor activity. Deletions of CFHR1/3 have been reported to be 
protective in AMD”°, suggesting a negative role of these proteins in 
this pathology. To demonstrate the potential capacity of MDA- 
binding CFHR to inhibit the beneficial co-factor activity of CFH, 
we tested whether C-terminal CFH fragments could compete for 
the co-factor activity by binding to MDA. Indeed, the MDA-binding 
fragment SCR18-20 prevented CFH from inducing iC3b generation, 
whereas a non-binding fragment containing SCR15-19 had no effect 
Figure 4 | CFH inactivates complement on MDA-bearing surfaces. 

a, Immunoblot of C3b cleavage products induced by three concentrations of 
CFH variants bound to coated MAA-BSA. α41/43 indicates iC3b cleavage 
products of the C3b α-chain. β indicates the C3b β-chain that remains 
uncleaved and served as a loading control. α41/43 was densitometrically 
quantified, and data are presented as percentage of iC3b generation achieved 
with 5 μg ml⁻¹ CFH Y402 (= 100%). Shown is the mean ± s.e.m. of three 
independent experiments. b, Immunoblot of C3b degradation products 
induced by CFH bound to coated MAA-BSA in the presence of either CFH 18- 
20 or CFH 15-19. 


(Fig. 4b). These data point towards a complex regulation of com- 
plement activation on MDA-decorated surfaces. 


CFH neutralizes proinflammatory effects of MDA 


The inflammatory process in AMD lesions has been suggested to be 
propagated by the secretion of cytokines including IL-8 (ref. 23). 
Stimulation of RPE cells (ARPE-19) with MAA-BSA induced the 
expression of IL-8 and caused an antioxidant response as indicated by 
upregulation of NAD(P)H dehydrogenase and hemoxygenase-1 
(Fig. 5a). We then tested the effect of CFH on MAA-LDL binding to 
macrophages, another cell type involved in AMD pathogenesis™. In a 
cell-based ELISA, CFH inhibited binding of MAA-LDL to thioglycollate- 
elicited macrophages in a dose-dependent manner (Fig. 5b). This indi- 
cates that CFH binds the same epitope on MAA-LDL that is necessary 
for its recognition by macrophages. Similar to ARPE-19 cells, monocytic 
THP-1 cells exhibited a robust expression of IL-8 following MAA-BSA 
stimulation. In addition, MAA-BSA induced the expression of TNF-α 
and IL-1, but not IL-12B (Supplementary Fig. 13). Importantly, MAA- 
BSA-induced IL-8 secretion was inhibited by physiological concentra- 
tions of CFH in a dose-dependent manner (Fig. 5c). In contrast, CFH 
had no effect on IL-8 production induced by phorbol myristate acetate 
(Supplementary Fig. 14). 

To evaluate the importance of this interaction in vivo, we examined 
the effect of MAA adducts in a murine model. First we validated that 
MAA-BSA could induce secretion of KC, the mouse orthologue to 
human IL-8, in murine macrophages (Supplementary Fig. 15). To test 
the proinflammatory effect of MAA-BSA as well as the scavenging 
capacity of CFH in an AMD-relevant site, we performed intravitreal 
microinjections of MAA-BSA with or without CFH. After six hours 
mice were killed, RPE/choroid was isolated from each eye after enuc- 
leation, and RNA was extracted. The purity of the preparation was 
confirmed by the expression of the RPE-specific gene Rpe65 and the 
lack of rhodopsin (Rho) expression as a marker for the neurosensory 
retina in all samples (Fig. 5d). MAA-BSA injection led to a sevenfold 
upregulation of KC expression in these RPE preparations, whereas 
BSA injection had no effect. Importantly, addition of CFH completely 
inhibited the effect induced by MAA-BSA (Fig. 5e). Thus, MAA 
adducts promote inflammatory responses in different cell types 
involved in AMD in vitro and in the eye in vivo, and CFH specifically 
neutralizes this property. 
Figure 5 | CFH neutralizes proinflammatory effects of MDA. a, Expression 
of indicated genes in ARPE-19 cells stimulated for 24 h with 50 μg ml⁻¹ BSA 
compared to MAA-BSA as determined by quantitative RT-PCR. Data 
represent the mean ± s.e.m. of three independent experiments. b, Cell-based 
ELISA for the binding of biotinylated MAA-LDL to thioglycollate-elicited 
macrophages in the presence of BSA or CFH. Data are expressed as B/B0 and 
represent mean ± s.d. of triplicate determinations. c, Secretion of IL-8 by THP- 
1 cells stimulated for 12 h with BSA or MAA-BSA in the absence or presence of 
CFH. Numbers below indicate concentrations of CFH, BSA and MAA-BSA in 
μg ml⁻¹. Error bars represent mean ± s.e.m. of three independent experiments. 
d, e, Intravitreal injection of BSA, MAA-BSA and/or CFH in mice (n = 4-5 per 
group). Six hours after injection, RPE/choroid was isolated. d, RT-PCR for 
Rpe65 and Rho in RPE/choroid fractions. cDNA isolated from neurosensory 
retina was used as a control (Ct). e, Expression of KC in RPE/choroid as 
assessed by quantitative RT-PCR. Error bars represent mean ± s.e.m. 
expression normalized to the BSA-injected group (*P < 0.05, **P < 0.01, 
***P < 0.001). 


Discussion 

We report the identification of CFH as a hitherto unrecognized innate 
defence protein against MDA, which is a ubiquitously generated 
proinflammatory product of lipid peroxidation’’”°®. Our discovery 
of CFH as a major MDA-binding protein demonstrates that innate 
immunity has a pivotal role in providing homeostatic responses 
against endogenous oxidation-specific danger-associated molecular 
patterns*. This parallels the innate immune response to another 
OSE, the PC headgroup of oxidized phosphatidylcholine, mediated 
by the macrophage scavenger receptors CD36 and SR-B1, the murine 
IgM natural antibody EO6/T15 and the acute phase reactant CRP*”». 
In an analogous manner, MDA is recognized by macrophage scavenger 
receptor SR-A"', several germline IgM natural antibodies'* and—as we 
now demonstrate—by CFH. 

CFH is one of the most abundant plasma proteins (~100- 
700 μg ml⁻¹) and the major regulator of complement activation". It 
mediates anti-inflammatory housekeeping functions by protecting 
self cells from complement activation”, which is especially important 
for dying cells that lose other surface-associated complement regula- 
tors**’’. A number of potential ligands for CFH on host cells have 
been studied, including glycosaminoglycans”, as well as annexin A2, 
DNA and histones on apoptotic cells”. We now identify MDA as a 
major ligand for CFH on apoptotic/necrotic cells and show that MDA 
epitopes provide a surface for CFH to allow local generation of anti- 
inflammatory iC3b fragments. This becomes relevant in situations 
when large amounts of cellular debris are generated. Of note, necrotic 
and, under certain conditions, apoptotic cells are proinflammatory 
per se*®**. In this regard, the interaction of CFH with MDA-modified 
cellular compounds is also important because CFH limits MDA- 
induced IL-8 secretion. This provides an explanation for the ability 
of CFH to reduce endothelial IL-8 secretion in response to apoptotic 
blebs**. Thus, MDA epitopes are responsible for the recruitment of 
CFH to the surface of apoptotic cells, where it neutralizes their pro- 
inflammatory properties and halts complement activation. 

We demonstrate that SCR7 and SCR20 mediate the binding of CFH 
to MDA. These two domains are clustering sites for mutations asso- 
ciated with AMD, but also other diseases'*. The most prominent 
example is the H402 exchange in SCR7, which has a frequency of 
35%, and may be responsible for over half of all AMD cases****. 
However, direct evidence for functional consequences of this poly- 
morphism remained elusive. Here we show that the H402 variant exhi- 
bits severely impaired binding to MDA in a gene-dosage-dependent 
manner, which correlates well with the H402-associated risk for devel- 
oping AMD. The H402 variant has been suggested to favour local 
complement activation as a result of reduced binding to glycosamino- 
glycans in the eye**. However, in contrast to glycosaminoglycans, MDA 
is enriched in the membranes of dying cells, which are continuously 
generated in the retina and need to be efficiently removed**. By demon- 
strating that the H402 variant has a reduced capacity to generate anti- 
inflammatory iC3b fragments on MDA-bearing surfaces, we provide a 
functional explanation for its strong disease association. It remains to 
be seen whether other genetic variations of CFH family members also 
affect MDA binding and thereby contribute to disease pathogenesis. 

Consistent with an earlier report, we found MDA epitopes 
throughout the choroid and Bruch’s membrane including drusen of 
AMD lesions*. However, even under physiological conditions, oxi- 
dized phospholipids are formed as a result of photic stimulation of 
retinal photoreceptors and subsequently scavenged by several pro- 
cesses including clearance via CD36 (ref. 37). As one of the major 
degradation products of peroxidized phospholipids, MDA is continu- 
ously generated. Several lipid peroxidation products, including MDA, 
can cause RPE damage”. Therefore, physiological housekeeping 
mechanisms are critically needed to prevent their accumulation and 
adverse reactions mediated by them. Our immunohistochemical 
results support a role for CFH, because CFH is found in the same 
locations as MDA in eyes with and without AMD. We show that 
MDA adducts, similar to what has been shown following the ingestion 
of oxidized photoreceptors”’, have the capacity to induce IL-8 secre- 
tion in RPE, which can be blocked by CFH. Increased IL-8 expression 
correlates with higher incidence of AMD, underlining its important 
pathogenic role”. Therefore, neutralization of MDA adducts by CFH 
has the potential to limit several pathogenic events in AMD. 

Undoubtedly, there are multiple defences against the ubiquitous 
MDA adducts. The described homeostatic response may be particu- 
larly limiting in the eye, as opposed to other sites where MDA adducts 
accumulate such as the vascular wall*°. Future studies need to evaluate 
the contribution of this newly found interaction in the pathogenesis of 
other inflammatory diseases. The findings described here may lead to 
novel approaches exploiting endogenous defence mechanisms for the 
prevention and therapy of chronic inflammation in general. 


METHODS SUMMARY 
Subjects and clinical diagnosis. The patient cohort used in this study has been 
described*". 
Protein modifications. The MAA modifications of LDL, BSA or polylysine were 
performed as described’. 

Intravitreal injection. Intravitreal injections of BSA, MAA-BSA and/or CFH 
were performed in male and female C57BL/6 mice. Six hours later, the RPE/ 
choroid was isolated and expression of target genes assessed by quantitative 
RT-PCR. 

Bead coupling and pull-down procedure. Mouse or human plasma was incu- 
bated with polylysine or MAA-polylysine beads. Bound proteins were analysed by 
LC-MSMS. The interaction was verified by immunoblotting and characterized by 
ELISA and Biacore. 

Flow cytometry and immunohistochemistry. The presence of MDA epitopes 
and CFH was visualized by flow cytometry on apoptotic Jurkat T-cell micro- 
particles and by immunohistochemistry for histological specimens. 

Co-factor assay. CFH bound to coated MAA-BSA was incubated with C3b and 
factor I and the generation of iC3b fragments was visualized by immunoblotting. 
Cell culture. Following stimulation with BSA or MAA-BSA in the presence or 
absence of CFH, gene expression was determined by quantitative RT-PCR and/or 
IL-8/KC secretion was quantified by ELISA. Binding of biotinylated MAA-LDL 
to peritoneal macrophages was assessed in the presence or absence of competitors 
as described’. 

Statistical analysis. Data are presented as mean ± s.d. or mean ± s.e.m. where 
indicated. Results were analysed by one-way analysis of variance and Student’s 
unpaired t-test. 
Detectable radio flares following gravitational waves 
from mergers of binary neutron stars 


Ehud Nakar & Tsvi Piran 


Mergers of neutron-star/neutron-star binaries are strong sources of 
gravitational waves’ °. They can also launch subrelativistic and mildly 
relativistic outflows** and are often assumed to be the sources of short 
γ-ray bursts’. An electromagnetic signature that persisted for weeks to 
months after the event would strengthen any future claim of a detec- 
tion of gravitational waves’®. Here we present results of calculations 
showing that the interaction of mildly relativistic outflows with the 
surrounding medium produces radio flares with peak emission at 1.4 
gigahertz that persist at detectable (submillijansky) levels for weeks, 
out to a redshift of 0.1. Slower subrelativistic outflows produce flares 
detectable for years at 150 megahertz, as well as at 1.4 gigahertz, from 
slightly shorter distances. The radio transient RT 19870422 (ref. 11) 
has the properties predicted by our model, and its most probable 
origin is the merger of a compact neutron-star/neutron-star binary. 
The lack of radio detections usually associated with short γ-ray bursts 
does not constrain the radio transients that we discuss here (from 
mildly relativistic and subrelativistic outflows) because short γ-ray 
burst redshifts are typically >0.1 and the appropriate timescales 
(longer than weeks) have not been sampled. 

Gravitational-wave detectors, and in particular the advanced LIGO 
and Virgo interferometers, are being constructed now with the goal of 
detecting gravitational waves from binary neutron-star coalescence at 
distances up to a few hundred megaparsecs (redshift z ~ 0.1)'*. The 
detection of an accompanying electromagnetic signal would comple- 
ment these efforts, providing an independent confirmation of the 
discovery and increasing the detectors’ effective sensitivity. The search 
for such an electromagnetic signal has therefore attracted much interest. 
The radioactive decay of ejected debris from the merger would drive a 
short-lived supernova-like event’*. For example, ejection of 0.01 solar 
masses (0.01M☉) from a merger at a distance of 300 Mpc would result 
in a faint optical flare that peaks after ~1 day (ref. 14). Finding, and 
especially identifying, such rare and faint events in the crowded variable 
optical sky is an extremely challenging task. Other authors have specu- 
lated on the production of low-frequency radio signals from the inter- 
action of the neutron stars’ magnetic fields'*"’”. These attempts focused 
on electromagnetic signals that are contemporaneous with, or follow 
quickly, the merger and the gravitational waves. Unfortunately, these 
predictions are highly uncertain. 

Here we predict a robust radio signal that peaks several weeks after 
the merger. Numerical simulations show that compact binary mergers 
launch energetic subrelativistic and mildly relativistic outflows** 
Ejection sources include unbound tidal tails, and winds driven by 
neutrino heating, nucleosynthesis and electromagnetic processes'*', 
emerging from the proto-neutron star or from an accretion disk. 
Overall, almost all merger models find a significant ejection of mass 
and energy. In binary neutron-star mergers, an ejection of about 
10°° erg at (0.1-0.2)c (where c is the speed of light) and about 
10” erg as faster ejecta is a fairly robust prediction. The outflow from 
black-hole/neutron-star mergers is less certain, but it is possibly more 
energetic and faster'* (~ 10° erg at 0.5c). 

The interaction of the outflow with the surrounding tenuous matter 
generates a blast wave. Although the outflow may be highly non-uniform 
initially, it becomes spherical rather quickly. We therefore consider a 
spherical outflow with energy $E$ and an initial velocity $c\beta_i$ that 
propagates into a medium with a constant density, $n$. If the outflow is not 
ultrarelativistic, it propagates at a constant velocity until time $t_{\rm dec}$, 
when, at a radius $R_{\rm dec}$, it collects a mass comparable to its own. Time 
$t_{\rm dec}$ (in days) is given by 

$$t_{\rm dec} = \frac{R_{\rm dec}}{c\beta_i} \approx 30\, E_{49}^{1/3}\, n_0^{-1/3}\, \beta_i^{-5/3} \qquad (1)$$

Here and in the following, unless stated otherwise, $q_x$ (where $q$ is any 
parameter) denotes the value of $q/10^x$ in c.g.s. units. At a radius 
$R > R_{\rm dec}$, the flow decelerates, assuming a Sedov-Taylor blast wave: 
$\beta \approx \beta_i (R/R_{\rm dec})^{-3/2}$. 
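
As a rough numerical illustration of the deceleration timescale, the following Python sketch evaluates it under the assumption that equation (1) takes the form $t_{\rm dec} \approx 30\, E_{49}^{1/3} n_0^{-1/3} \beta_i^{-5/3}$ days, with $R_{\rm dec} = c\beta_i t_{\rm dec}$ and a Sedov-Taylor decline beyond $R_{\rm dec}$. The function and parameter names are illustrative and do not come from the original work.

```python
C_CM_PER_S = 2.998e10       # speed of light in cm/s
SECONDS_PER_DAY = 86400.0


def t_dec_days(e49, n0, beta_i):
    """Deceleration time in days (assumed scaling of equation (1)).

    e49    -- outflow energy in units of 1e49 erg
    n0     -- external density in units of 1 cm^-3
    beta_i -- initial velocity in units of c
    """
    return 30.0 * e49 ** (1.0 / 3.0) * n0 ** (-1.0 / 3.0) * beta_i ** (-5.0 / 3.0)


def r_dec_cm(e49, n0, beta_i):
    """Deceleration radius in cm, from R_dec = c * beta_i * t_dec."""
    return C_CM_PER_S * beta_i * t_dec_days(e49, n0, beta_i) * SECONDS_PER_DAY


def beta_at_radius(r_cm, e49, n0, beta_i):
    """Velocity (in units of c) at radius r_cm.

    Constant up to R_dec, then declining as (R / R_dec)^(-3/2).
    """
    r_dec = r_dec_cm(e49, n0, beta_i)
    if r_cm <= r_dec:
        return beta_i
    return beta_i * (r_cm / r_dec) ** -1.5


if __name__ == "__main__":
    print(t_dec_days(1.0, 1.0, 1.0))    # mildly relativistic, canonical values
    print(t_dec_days(10.0, 1.0, 0.2))   # subrelativistic, more energetic
```

Under these assumptions the canonical mildly relativistic case decelerates after about a month, whereas a subrelativistic outflow with $E_{49} = 10$ and $\beta_i = 0.2$ takes of the order of years, in line with the timescales quoted in the text.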

The blast wave generates magnetic fields and accelerates particles 
that emit synchrotron radiation. The same microphysics used success- 
fully to model radio emission of type Ibc supernovae’, where $\beta \approx 0.2$, 
and to model late radio emission of γ-ray bursts**”°, where the flow is 
mildly relativistic, is applicable here. In both cases, the electrons and the 
magnetic field are found to carry significant fractions of the total 
internal energy of the shocked gas, $\epsilon_e \approx \epsilon_B \approx 0.1$. The observed spectra 
reveal a power-law distribution of the electrons’ Lorentz factor, $\gamma$: 
$dN/d\gamma \propto \gamma^{-p}$ for $\gamma > \gamma_m = [(p-2)/(p-1)](m_p/m_e)\epsilon_e\beta^2$, where $m_p$ and $m_e$ 
are the proton and electron masses, respectively, $\gamma_m$ is the minimal 
Lorentz factor of the electrons’ distribution and $p \approx 2$-3. 

The radio spectrum is determined by $\nu_m$, the synchrotron frequency 
of electrons with Lorentz factor $\gamma_m$, and by $\nu_a$, the synchrotron self- 
absorption frequency (Supplementary Information). The specific flux, 
$F_\nu$, at a given frequency is strongly suppressed below $\nu_a$, and it 
decreases as $\nu^{-(p-1)/2}$ above $\nu_m$ and $\nu_a$. The signal across the whole 
spectrum increases at $t < t_{\rm dec}$. Its behaviour after $t_{\rm dec}$ depends on the 
relation between the observed frequency, $\nu_{\rm obs}$, and $\nu_m$ and $\nu_a$. The 
signal peaks at $t_{\rm dec}$ if $\nu_{\rm obs} > \nu_{a,\rm dec}, \nu_{m,\rm dec}$, where $\nu_{a,\rm dec} = \nu_a(t_{\rm dec})$ and 
$\nu_{m,\rm dec} = \nu_m(t_{\rm dec})$. Otherwise the signal peaks when $\nu_{\rm obs} = \nu_m$ or when 
$\nu_{\rm obs} = \nu_a$, whichever is latest. 

The flare characteristics are most sensitive to the initial velocity of 
the outflow, $\beta_i$. Because the brightest radio emission is observed at $t_{\rm dec}$, 
a lower value of $\beta_i$ implies a longer rise time of the radio emission after 
the merger. Additionally, $\nu_{m,\rm dec}$, $\nu_{a,\rm dec}$ and the peak flux at any observed 
frequency depend strongly on $\beta_i$. As mergers are expected to eject an 
outflow over a range of velocities, we discuss separately below the 
observed signature of mildly relativistic ($\beta_i \approx 1$) and subrelativistic 
($\beta_i \approx 0.2$) ejecta. 

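A mildly relativistic blast wave with canonical parameters produces 
a synchrotron spectrum with $\nu_{a,\rm dec} \approx \nu_{m,\rm dec} \approx 1$ GHz. The strongest 
signal is then expected at time $t_{\rm dec}$ (a few weeks after the merger) 
and around 1.4 GHz (Supplementary Information): the peak of the 
observed specific flux at $\nu_{\rm obs}$ in units of millijanskys is 

$$F_{\nu_{\rm obs},{\rm peak}}[\nu_{\rm obs} > \nu_{m,\rm dec}, \nu_{a,\rm dec}] \approx 0.3\, E_{49}\, n_0^{(p+1)/4}\, \epsilon_{B,-1}^{(p+1)/4}\, \epsilon_{e,-1}^{p-1}\, \beta_i^{(5p-7)/2}\, d_{27}^{-2} \left(\frac{\nu_{\rm obs}}{1.4}\right)^{-(p-1)/2} \qquad (2)$$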
where $d$ is the distance to the merger (here $\nu_{\rm obs}$ is in GHz). The peak 
flux at lower frequencies (<1 GHz) is significantly lower and it is 
observed at a later time. If the outflow is subrelativistic ($\beta_i \approx 0.1$-0.2), 
then $\nu_{a,\rm dec}, \nu_{m,\rm dec} \lesssim 150$ MHz and equation (2) is applicable also 
in the frequency range of low-frequency radio detectors. The flux 
peaks at $t_{\rm dec}$, which is of the order of years, and it is brighter at 
150 MHz than at 1 GHz by about an order of magnitude. Note that 
over the whole expected range of blast wave parameters, $\nu_a \lesssim 1$ GHz at 
all times and the spectrum above 1 GHz is optically thin during the 
entire evolution. We stress that in radio supernovae, the surrounding 
dense winds lead at early time to an optically thick spectrum at 
$\nu_{\rm obs} > 1$ GHz, and the transition $\nu_a = \nu_{\rm obs}$ determines the time and flux 
at the peak. As discussed below, this different spectral signature 
enables us to distinguish between merger flares and radio supernovae. 
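
To make the parameter dependence of the peak flux concrete, the Python sketch below evaluates a reconstruction of equation (2). The form used is an assumption based on standard optically thin synchrotron scalings: $F \approx 0.3\, E_{49}\, n_0^{(p+1)/4} \epsilon_{B,-1}^{(p+1)/4} \epsilon_{e,-1}^{p-1} \beta_i^{(5p-7)/2} d_{27}^{-2} (\nu_{\rm obs}/1.4)^{-(p-1)/2}$ mJy. The function name, argument names and the default $p = 2.5$ are illustrative choices, not taken from the original work.

```python
def peak_flux_mjy(nu_obs_ghz, e49=1.0, n0=1.0, beta_i=1.0, d27=1.0,
                  eps_b=0.1, eps_e=0.1, p=2.5):
    """Peak specific flux in mJy (assumed form of equation (2)).

    Only meaningful when nu_obs lies above both nu_m and nu_a at t_dec;
    that condition is not checked here.
    d27 is the distance in units of 1e27 cm.
    """
    eps_b_m1 = eps_b / 0.1   # epsilon_B in units of 0.1
    eps_e_m1 = eps_e / 0.1   # epsilon_e in units of 0.1
    return (0.3 * e49
            * n0 ** ((p + 1) / 4)
            * eps_b_m1 ** ((p + 1) / 4)
            * eps_e_m1 ** (p - 1)
            * beta_i ** ((5 * p - 7) / 2)
            * d27 ** -2
            * (nu_obs_ghz / 1.4) ** (-(p - 1) / 2))


if __name__ == "__main__":
    # Canonical mildly relativistic case at 1.4 GHz and d = 1e27 cm.
    print(peak_flux_mjy(1.4))
    # Same outflow in a low-density medium (n = 1e-3 cm^-3, p = 2).
    print(peak_flux_mjy(1.4, n0=1e-3, p=2.0))
```

In this sketch the canonical case gives 0.3 mJy at $10^{27}$ cm (about 300 Mpc), and lowering the density to $10^{-3}$ cm$^{-3}$ pushes the peak down to the microjansky level, which illustrates how steeply the signal depends on the environment.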

The circum-merger density also strongly affects the flare signature. 
For example, if the surrounding particle density is ~10⁻³ cm⁻³, the 
peak flux from a mildly relativistic ejecta decreases to the microjansky 
level at a distance of 300 Mpc, and the timescale increases by a factor of 
ten, to a year. A merger taking place in such a density can be detected 
only up to distances of ~100 Mpc (Table 1). The density is expected to 
vary significantly, from n ~ 1 cm °,in galactic disks, ton ~ 10 °cm °, 
for mergers taking place outside their host galaxies. Because all observed 
Galactic neutron-star binaries reside within the Galactic disk, where the 
average density is n ~ 1 cm’ *, a significant fraction of the cosmological 
mergers are expected also to take place in rather dense environments. 
We therefore use 1 = 1 cm * as the canonical density value. If mergers 
produce short y-ray bursts, then observations of their afterglows sup- 
port this value (Supplementary Information). 

An intriguing possibility is that compact merger events also eject 
ultrarelativistic jets that produce short γ-ray bursts’ (SGRBs). It is 
important to examine the relationship between SGRBs and the radio 
flares discussed here, assuming that mergers are producing SGRBs. An 
SGRB beamed towards us will be observed in coincidence with the 
gravitational-wave signal, providing a clear electromagnetic counter- 
part. Even if the SGRB itself is missed, owing to partial sky coverage, 
its afterglow will be easily detectable. However, SGRBs are expected to 
be beamed, and only rarely will one point towards us. A beamed SGRB 
observed off-axis produces, once it has slowed down, a long-lasting 
radio ‘orphan’ afterglow”, similar in its characteristics to the mildly 
relativistic signal discussed above. However, the total energy (corrected 
for beaming) in the ultrarelativistic jet is at most comparable to—and 
probably lower than—that of the mildly relativistic ejecta. Consequently 
the latter will dominate the radio emission. 

The radio remnant signals that we consider here, which are generated 
by subrelativistic and mildly relativistic outflows, could not have been 
detected in the radio afterglow searches that were carried out following 
SGRB triggers. The reasons are twofold. First, SGRBs are typically 
detected at distances of 1-3 Gpc, far beyond the detection horizon of 
gravitational-wave detectors. Hence the signals are much weaker than 
those associated with detected gravitational-wave events. Second, 
SGRB afterglow searches are optimized to detect the emission from 
ultrarelativistic ejecta pointing towards the observer. Such emission 


Table 1 | Observing radio flares 


peaks at higher frequencies and on shorter timescales than the emission 
from mildly and subrelativistic ejecta discussed here. SGRB afterglow 
searches are done at a sensitivity of ~0.1 mJy at 4.8-8.5 GHz during the 
first week or two after the bursts (see, for example, refs 27-29). Equation 
(2) implies that over the distance range 1-3 Gpc, the radio signal of 
mildly relativistic ejecta with energy 10°’ erg that propagate into a 
medium of density n = 1 cm⁻³ peaks after ~60 days at a flux of 
~0.01-0.1 mJy at 5 GHz. The flux before the peak rises as $(t/t_{\rm dec})^3$ 
(Supplementary Information). Therefore, these early radio afterglow 
searches could not have detected the radio signal, even if the mildly 
relativistic ejecta had an energy of 10°" erg. Thus, the paucity of detected 
SGRB radio afterglows has no direct implication for the nature of the 
remnants we discuss here. 

A new wave of radio detectors is now coming online. The most sensi- 
tive operate at frequencies of 1.4 GHz and higher. Table 1 summarizes 
the relevant properties of these facilities and their detection horizons. The 
best facility for a targeted search, following a detection of a candidate 
gravitational-wave source, is clearly the EVLA. A deep, ~50 μJy, loca- 
lized EVLA search of the 10-100 deg² error box of a gravitational-wave 
trigger” can detect mildly relativistic ejecta (with an energy of even 
~10"* erg) out to the horizon of advanced gravitational-wave detectors. 
The upcoming lower-frequency LOFAR sensor array will be more effec- 
tive in searches for subrelativistic outflows, whose signals peak at 
LOFAR’s frequencies, thus compensating for LOFAR’s lower sensitivity. 
LOFAR is also relatively more effective when searching for flares in a low- 
density medium. 
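
The detection horizons discussed here follow from an inverse-square scaling of the peak flux with distance. As a minimal Python sketch, assuming a detection threshold of four times the r.m.s. noise (the criterion stated in the notes to Table 1), a Euclidean distance with no cosmological corrections, and a peak flux specified at a reference distance of $10^{27}$ cm, the horizon can be estimated as follows. The function name and arguments are hypothetical.

```python
import math

CM_PER_MPC = 3.0857e24   # centimetres per megaparsec


def detection_horizon_mpc(flux_at_d27_mjy, rms_ujy, threshold=4.0):
    """Distance (Mpc) at which the peak flux equals threshold * r.m.s.

    flux_at_d27_mjy -- peak flux in mJy at a distance of 1e27 cm
    rms_ujy         -- r.m.s. background noise in microjansky
    Flux is assumed to fall off as distance^-2 (no cosmological corrections).
    """
    limit_mjy = threshold * rms_ujy * 1e-3
    d27 = math.sqrt(flux_at_d27_mjy / limit_mjy)
    return d27 * 1e27 / CM_PER_MPC


if __name__ == "__main__":
    # 0.3 mJy at 1e27 cm observed with a 30 microjansky one-hour r.m.s.
    print(detection_horizon_mpc(0.3, 30.0))
```

For a canonical 0.3 mJy flare and a one-hour r.m.s. of 30 μJy this gives roughly 500 Mpc, comparable to the corresponding entry in Table 1.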

Even before the completion of the advanced gravitational-wave 
detectors, blind searches can identify radio flares from compact binary 
mergers. Identification of radio emission from any merger type (binary 
neutron star or black hole/neutron star) would determine the merger 
rate, which is a parameter of utmost importance for the design and 
operation of advanced detectors. With a merger rate of 300 Gpc⁻³ yr⁻¹, 
we expect these facilities to detect ~20 remnants from mildly relativistic 
outflows with an energy of E = 10⁴⁹ erg (and ~1,000 remnants if 
E = 10⁵⁰ erg), in a single whole-sky snapshot. LOFAR may detect a 
dozen transients in a whole-sky survey, even if only a subrelativistic 
outflow with an energy of 10°’ erg is ejected (see Supplementary 
Information for details, as well as for a discussion of ways to distinguish 
these flares from other possible radio transients). 
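
An order-of-magnitude feel for such snapshot counts can be obtained with a simple estimate: the merger rate multiplied by the Euclidean volume within the detection horizon and by the time a flare stays above threshold. The Python sketch below does exactly this. It is a deliberately crude assumption-laden estimate (the visibility time is taken to be roughly the deceleration time, and cosmological corrections are ignored), not the calculation in the Supplementary Information, and the names are illustrative.

```python
import math


def snapshot_count(rate_gpc3_yr, horizon_gpc, visible_yr):
    """Expected number of flares above threshold in one whole-sky snapshot.

    rate_gpc3_yr -- merger rate in Gpc^-3 yr^-1
    horizon_gpc  -- detection horizon in Gpc (Euclidean volume assumed)
    visible_yr   -- time each flare remains detectable, in years
    """
    volume_gpc3 = 4.0 / 3.0 * math.pi * horizon_gpc ** 3
    return rate_gpc3_yr * volume_gpc3 * visible_yr


if __name__ == "__main__":
    # 300 Gpc^-3 yr^-1, 500 Mpc horizon, flare visible for about 30 days.
    print(snapshot_count(300.0, 0.5, 30.0 / 365.25))
```

With a rate of 300 Gpc⁻³ yr⁻¹, a 500 Mpc horizon and a visibility time of about a month, the estimate is of order ten remnants, the same order of magnitude as the number quoted above.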

Remarkably, the observed 5 GHz transient RT 19870422 (ref. 11) 
shows all the expected properties of the radio remnant of a compact 
binary merger. At 1 Gpc distance and with a duration of two months, 
this transient is what we would expect from a mildly relativistic outflow 
with an energy of ~10°° erg. The inferred rate of similar transients"', 
80-20,000 Gpc⁻³ yr⁻¹, is fully consistent with the estimates of com- 
pact binary mergers. This transient is therefore an excellent candidate 
to be the first observed radio remnant of a merger. Unfortunately, 
we cannot rule out the possibility that this is an especially bright 
radio supernova''. We note, however, that this latter interpretation 
requires a supernova brighter by an order of magnitude than any radio 
supernovae previously observed. Simultaneous optical observations or 
multiwavelength radio observations could have easily distinguished 


Radio      Observing   Field of   One-hour   One-hour detection horizon†                            Ten-hour detection horizon‡
facility   frequency   view       r.m.s.*    β_i ≈ 1,       β_i ≈ 1,       β_i = 0.2, E_49 = 10,    β_i ≈ 1, E_49 = 1,
           (GHz)       (deg²)     (μJy)      E_49 = 1,      E_49 = 10,     n_0 = 1, p = 2.5         n_0 = 10⁻³, p = 2
                                             n_0 = 1        n_0 = 1
EVLA       1.4         0.25       7          1 Gpc          3.3 Gpc        370 Mpc                  140 Mpc
ASKAP      1.4         30         30         500 Mpc        1.6 Gpc        180 Mpc                  70 Mpc
MeerKAT    1.4         1.5        35         500 Mpc        1.6 Gpc        165 Mpc                  65 Mpc
Apertif    1.4         8          50         400 Mpc        1.25 Gpc       140 Mpc                  50 Mpc
LOFAR      0.15        20         1,000      35 Mpc         90 Mpc         70 Mpc                   20 Mpc

Shown are properties and detection horizons (neglecting cosmological corrections) for an observation at different radio facilities of blast waves with various values of β_i, E_49, n_0 and p (in all cases, ε_e ≈ ε_B ≈ 0.1; see 
text for definitions of these symbols). Information on facilities is available as follows: EVLA (http://www.aoc.nrao.edu/evia); ASKAP (http://www.atnf.csiro.au/projects/askap/technology.html); MeerKAT (http://www.ska.ac.za/meerkat); Apertif (http://www.astron.nl/general/apertif/apertif); and LOFAR (http://lofar.org). 

*The root mean squared value of the background noise for one hour of observation. 

†The distance at which the observed peak flux is four times the one-hour r.m.s. 

‡The distance at which the observed peak flux is four times the root mean squared value of the background noise for ten hours of observation. 
between the two possibilities. Unfortunately, no such observations are 
available. However, the detection rate implied by this event is very 
high, indicating that similar events could easily be detected by a rela- 
tively small-scale survey and that their nature should be easily probed. 
Resonances arising from hydrodynamic memory in 


Brownian motion 


Thomas Franosch, Matthias Grimm, Maxim Belushkin, Flavio M. Mor, Giuseppe Foffi, Laszlo Forro & Sylvia Jeney 


Observation of the Brownian motion of a small probe interacting 
with its environment provides one of the main strategies for 
characterizing soft matter’ *. Essentially, two counteracting forces 
govern the motion of the Brownian particle. First, the particle is 
driven by rapid collisions with the surrounding solvent molecules, 
referred to as thermal noise. Second, the friction between the particle 
and the viscous solvent damps its motion. Conventionally, the ther- 
mal force is assumed to be random and characterized by a Gaussian 
white noise spectrum. The friction is assumed to be given by the 
Stokes drag, suggesting that motion is overdamped at long times in 
particle tracking experiments, when inertia becomes negligible. 
However, as the particle receives momentum from the fluctuating 
fluid molecules, it also displaces the fluid in its immediate vicinity. 
The entrained fluid acts back on the particle and gives rise to long- 
range correlations”®. This hydrodynamic ‘memory’ translates to 
thermal forces, which have a coloured, that is, non-white, noise 
spectrum. One hundred years after Perrin’s pioneering experiments 
on Brownian motion’”®, direct experimental observation of this 
colour is still elusive’®. Here we measure the spectrum of thermal 
noise by confining the Brownian fluctuations of a microsphere in a 
strong optical trap. We show that hydrodynamic correlations result 
in a resonant peak in the power spectral density of the sphere’s 
positional fluctuations, in strong contrast to overdamped systems. 
Furthermore, we demonstrate different strategies to achieve peak 
amplification. By analogy with microcantilever-based sensors’, 
our results reveal that the particle-fluid-trap system can be con- 
sidered a nanomechanical resonator in which the intrinsic hydro- 
dynamic backflow enhances resonance. Therefore, instead of being 
treated as a disturbance, details in thermal noise could be exploited 
for the development of new types of sensor and particle-based assay 
in lab-on-a-chip applications'*"™*. 

Einstein’s theory of Brownian motion”’, published in 1905, received 
considerable attention and was later reformulated in terms of a Langevin 
equation’®. In it, particle motion is driven by thermal fluctuations 
induced through collisions with the fluid molecules. These rapid ‘kicks’ 
are assumed to be random and independent at frequencies much smaller 
than the collision rate of ~1 THz. The thermal force consequently has a 
white noise spectrum’®; that is, the spectrum is constant over a wide 
range of frequencies. Momentum is transferred from the particle to the 
fluid at times $\tau_p = m_p/\gamma$ (Fig. 1a, left), where $\gamma = 6\pi\eta R$ is the coefficient 
of static friction of the particle for macroscopic no-slip boundary con- 
ditions, $m_p$ is the particle’s mass, $\eta$ is the shear viscosity of the fluid and $R$ 
is the radius of the particle (which is taken to be spherical). However, 
when the densities of the particle and the fluid, $\rho_p$ and, respectively, $\rho_f$, 
are comparable, their coupling becomes important’”'*. As the sus- 
pended particle fluctuates through the solvent, long-range correlations 
build up as a result of momentum exchange, leading to hydrodynamic 
memory in the solvent. Hence, an additional timescale, $\tau_f = R^2\rho_f/\eta$, 
which describes the time needed by the perturbed fluid flow field to 


diffuse over one particle radius (Fig. 1a, middle), becomes important. 
According to the fluctuation-dissipation theorem, the statistics of the 
thermal force, $F_{\rm th}(t)$, is characterized by a delta-correlated white noise 
term and a coloured, frequency-dependent component that reflects the 
retarded viscous response of the fluid continuum to the particle. 
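
To give a sense of scale for the two timescales introduced above, the following Python sketch evaluates $\tau_p = m_p/\gamma$ with $\gamma = 6\pi\eta R$, and $\tau_f = R^2\rho_f/\eta$, in SI units. The numerical values (a 1 μm radius sphere of density 2,000 kg m⁻³ in water) are hypothetical and are not the experimental parameters of this study.

```python
import math


def stokes_friction(eta, radius):
    """Stokes friction coefficient, gamma = 6 * pi * eta * R."""
    return 6.0 * math.pi * eta * radius


def tau_p(radius, rho_p, eta):
    """Momentum relaxation time of a sphere, tau_p = m_p / gamma."""
    mass = 4.0 / 3.0 * math.pi * radius ** 3 * rho_p
    return mass / stokes_friction(eta, radius)


def tau_f(radius, rho_f, eta):
    """Time for fluid vorticity to diffuse over one radius, R^2 * rho_f / eta."""
    return radius ** 2 * rho_f / eta


if __name__ == "__main__":
    R = 1.0e-6        # sphere radius in m (hypothetical)
    RHO_P = 2000.0    # particle density in kg/m^3 (hypothetical)
    RHO_F = 1000.0    # water density in kg/m^3
    ETA = 1.0e-3      # water shear viscosity in Pa s
    print(tau_p(R, RHO_P, ETA), tau_f(R, RHO_F, ETA))
```

Because $\tau_p/\tau_f = 2\rho_p/(9\rho_f)$ for a sphere, the two timescales are of the same order whenever the particle and fluid densities are comparable, which is why the fluid's inertia cannot be neglected in that regime.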

To measure directly the predicted correlations in thermal noise, we
combined strong optical trapping with high-resolution, 3D position
detection (Supplementary Information, section 1). The resulting
force balance for the particle reads
$m_p\ddot{x}(t) = F_{\mathrm{fr}}(t) - Kx(t) + F_{\mathrm{th}}(t)$,
where $x(t)$ is the particle’s displacement from the trap centre (with a
dot denoting differentiation with respect to time), $F_{\mathrm{fr}}(t)$
is the non-instantaneous friction force on the particle and $K$ is the
stiffness of the optical trap. This harmonic restoring force gives the
trap relaxation time of $\tau_K = \gamma/K$ (Fig. 1a, right). At long
times, strong trapping eventually dominates over friction and becomes
the main force counteracting thermal excitation. The Langevin equation
reduces then to $Kx(t) \approx F_{\mathrm{th}}(t)$. Consequently, when
tracking the fluctuating motion of the particle in a strong harmonic
potential (Fig. 1b, c), we effectively probe the thermal force of the
fluid. Correlations in thermal noise

$$\langle F_{\mathrm{th}}(t)F_{\mathrm{th}}(0)\rangle \approx K^2\langle x(t)x(0)\rangle \qquad (1)$$


[Figure 1, panel a labels: Ballistic motion; Backflow; Trapped motion. Inertial and hydrodynamic regimes marked along a logarithmic time axis.]


Figure 1 | Characteristic time scales of a Brownian particle confined by the
three-dimensional (3D) harmonic potential of an optical trap. a, On very
short timescales ($t < \tau_p$), the particle undergoes ballistic motion governed by
$m_p$ (left). On timescale $\tau_f$, hydrodynamic backflow develops (centre; solid lines
show the emerging fluid velocity field; arrows are obtained from our computer
simulations). Finally, for $t \gtrsim \tau_K$, the harmonic potential of the trap sets in and
confines particle diffusion (right). b, Trajectories of the trapped sphere
measured at three different time intervals, in the dimensions, x and y, lateral to
the optical axis, z. c, 3D position histogram of the same sphere after the
measurement time $T \approx 4$ ms. The small displacements of the bead are indicative
of the strong trapping forces.
become directly accessible through the positional autocorrelation
function, $\mathrm{PAF}(t) = \langle x(t)x(0)\rangle$, calculated from the recorded
fluctuations of the trapped sphere (Fig. 1b).
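
As an illustration of how the positional autocorrelation function might be estimated from a recorded trajectory, the following minimal sketch averages lagged products of the sampled displacement and then applies equation (1). It is not the authors’ analysis code: the function names are hypothetical, the input is assumed to be already centred on the trap position and sampled at a fixed interval, and the conversion to force correlations holds only in the strong-trapping, long-time limit described above.

```python
def positional_autocorrelation(x, max_lag):
    """Estimate PAF(k*dt) = <x(t) x(t + k*dt)> for lags k = 0..max_lag.

    x is a sequence of displacements from the trap centre, sampled at a
    fixed interval dt (assumed already mean-subtracted). Each lag is
    averaged over all available pairs of samples.
    """
    n = len(x)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    paf = []
    for k in range(max_lag + 1):
        pairs = n - k
        paf.append(sum(x[i] * x[i + k] for i in range(pairs)) / pairs)
    return paf


def thermal_force_correlation(paf, stiffness):
    """Apply equation (1): <F_th(t) F_th(0)> is approximately K^2 <x(t) x(0)>."""
    return [stiffness ** 2 * value for value in paf]
```

A perfectly alternating toy signal such as `[1.0, -1.0, 1.0, -1.0]` makes the sign change at odd lags easy to see; real data would of course be much longer and noisier.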

The sequential order and magnitudes of the various timescales
depend on the size and mass of the Brownian particle, the nature of
the solvent and the stiffness of the optical trap. The experimental
challenge consists of exploring a wide range of timescales. Therefore, we
optimized our set-up to achieve a resolution of ~1 nm in space and
close to 1 µs in time (Supplementary Information, section 1). In typical
optical tweezers experiments, silica or polystyrene spheres immersed in
water and with sizes <1 µm are used. This yields values for $\tau_f$ and $\tau_p$ of
less than 1 µs, which is below our temporal resolution limit. Therefore,
we instead used melamine resin beads with diameters of between 2 and
3 µm and suspended them in acetone, which is three times less viscous
than water. In this set-up, $\tau_p \approx 1$-3 µs and $\tau_f \approx 2$-6 µs. Furthermore,
the difference between the refractive indices of resin (n = 1.68) and
acetone (n = 1.36) was high enough to provide good trapping
efficiency. With such an experimental configuration, we could increase
$\tau_f$ and $K$ sufficiently to bring the trap relaxation time, $\tau_K$, close to $\tau_f$.
This made the window in which mainly thermal force correlations
determine the bead’s dynamics (Fig. 1a, middle) experimentally
accessible. On these timescales, the mass of the particle is already negligible,
leading to a clear separation between the inertial and hydrodynamic
regimes of Brownian motion (Supplementary Information, section 5).
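
The three timescales can be estimated directly from the definitions $\tau_p = m_p/\gamma$, $\tau_f = R^2\rho_f/\eta$ and $\tau_K = \gamma/K$. The sketch below does this for a resin bead in acetone, using the material values quoted in the Methods Summary and the stiffness quoted in the Fig. 2 caption; the function name is hypothetical and the results are nominal estimates, not the fitted values reported in the figure captions.

```python
import math


def brownian_timescales(radius, rho_particle, rho_fluid, viscosity, stiffness):
    """Return (tau_p, tau_f, tau_K) in seconds for a sphere in a harmonic trap.

    All inputs are in SI units. gamma is the no-slip Stokes friction
    coefficient 6*pi*eta*R used in the text.
    """
    gamma = 6.0 * math.pi * viscosity * radius
    mass = (4.0 / 3.0) * math.pi * radius ** 3 * rho_particle
    tau_p = mass / gamma                          # momentum relaxation
    tau_f = radius ** 2 * rho_fluid / viscosity   # vortex diffusion over one radius
    tau_k = gamma / stiffness                     # trap relaxation
    return tau_p, tau_f, tau_k


# R = 1.45 um resin bead in acetone (eta = 0.32 cP), K = 205 uN/m
tau_p, tau_f, tau_k = brownian_timescales(
    radius=1.45e-6,
    rho_particle=1510.0,
    rho_fluid=790.0,
    viscosity=0.32e-3,
    stiffness=205e-6,
)
```

With these inputs the estimates come out at roughly 2.2 µs, 5.2 µs and 43 µs, which fall inside or near the ranges stated above; the trap relaxation time in particular depends on the actual stiffness and is better taken from a fit.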

Figure 2a, b shows the mean squared displacements and positional
autocorrelation functions calculated from the measured position
fluctuations, $x(t)$ (Supplementary Information, section 3), of a single resin
sphere immersed in water (green circles) or in acetone (blue circles) and
held with comparable optical forces. PAF(t) has a clear zero-crossing
followed by anticorrelations. The appearance of anticorrelations is in
remarkable contrast with the exponential relaxation, $(k_BT/K)e^{-t/\tau_K}$,
characteristic of overdamped harmonic oscillators subject to instantaneous
Stokes friction $-\gamma\dot{x}(t)$. In the frequency domain (Fig. 2c), the
corresponding power spectral density, PSD(f), shows that increasing $\tau_f$,
and hence decreasing the ratio $\tau_K/\tau_f$, by reducing the fluid’s viscosity
caused the emergence of a resonance. This resonant peak indicates that
the thermal force spectrum is enhanced as frequency increases (blue
circles). In water (green circles), the maximal corner frequency we
obtained, $f_K = 1/2\pi\tau_K$, resulted in an enhancement of the PSD close
to our noise limit. Nevertheless, deviations from the simple Lorentzian,
$\mathrm{PSD}(f) = 2(k_BT/K)\tau_K/[1 + (2\pi f\tau_K)^2]$, of overdamped systems (green
line) are clearly visible for frequencies around $f_K$.
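
For reference, the overdamped baseline against which the data are compared can be written down in a few lines. The sketch below evaluates the exponential PAF and the Lorentzian PSD quoted above; the function names are hypothetical, and the temperature of 295 K used in the example is an assumption, since the text does not state it.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def exponential_paf(t, tau_k, stiffness, temperature):
    """PAF(t) = (k_B T / K) exp(-t / tau_K) for instantaneous Stokes friction."""
    return (K_B * temperature / stiffness) * math.exp(-t / tau_k)


def lorentzian_psd(f, tau_k, stiffness, temperature):
    """PSD(f) = 2 (k_B T / K) tau_K / [1 + (2 pi f tau_K)^2]."""
    return 2.0 * (K_B * temperature / stiffness) * tau_k / (
        1.0 + (2.0 * math.pi * f * tau_k) ** 2
    )


def corner_frequency(tau_k):
    """f_K = 1 / (2 pi tau_K), where the Lorentzian drops to half its plateau."""
    return 1.0 / (2.0 * math.pi * tau_k)


# Example: fitted tau_K for acetone from the Fig. 2 caption; T is assumed.
f_k = corner_frequency(38.3e-6)
half = lorentzian_psd(f_k, 38.3e-6, 205e-6, 295.0)
plateau = lorentzian_psd(0.0, 38.3e-6, 205e-6, 295.0)
```

Any positive anticorrelation-free PAF and any PSD that stays below the plateau is consistent with this baseline; the zero-crossing and the resonance described above are precisely what it cannot produce.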

For a quantitative description, we solve a Langevin equation with
no-slip boundary conditions accounting for slow vortex diffusion
and trapping. Our data (Fig. 2, symbols) are in excellent agreement
with the theoretical expression (Fig. 2, black lines) over three decades in
time and four orders of magnitude in signal (Fig. 2b), and we observe a
hydrodynamic power-law tail, $\mathrm{PAF}(t) \approx -(k_BT\gamma/K^2)\sqrt{\tau_f/4\pi t^3}$,
which by equation (1) directly reflects the corresponding persistent
correlations in the thermal forces

$$\langle F_{\mathrm{th}}(t)F_{\mathrm{th}}(0)\rangle \approx -k_BT\gamma\sqrt{\tau_f/4\pi}\,t^{-3/2}$$

for times for which compressibility effects from the fluid can be ignored
(Supplementary Information, section 4). The negative overshoot in the
PAF and the equivalent resonant peak in the PSD originate solely from
the hydrodynamic coupling between the fluid and the particle.
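
Dividing the power-law tail by the initial value $\mathrm{PAF}(0) = k_BT/K$ and using $\tau_K = \gamma/K$ gives a dimensionless form, $\mathrm{PAF}(t)/\mathrm{PAF}(0) \approx -\tau_K\sqrt{\tau_f/4\pi t^3}$, which is convenient for judging how small the anticorrelations are. The sketch below evaluates it; the function name is hypothetical and the expression is only meaningful at times well beyond the zero-crossing.

```python
import math


def paf_tail_normalized(t, tau_f, tau_k):
    """Long-time tail of PAF(t)/PAF(0): -tau_K * sqrt(tau_f / (4 pi t^3))."""
    return -tau_k * math.sqrt(tau_f / (4.0 * math.pi * t ** 3))


# Fitted acetone values from the Fig. 2 caption, evaluated at t = 1 ms
tail = paf_tail_normalized(1e-3, tau_f=5.1e-6, tau_k=38.3e-6)
```

Because the tail scales as $t^{-3/2}$, quadrupling the lag time reduces its magnitude by a factor of eight, which is why resolving it requires several orders of magnitude in signal.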


In the Fourier domain, positional fluctuations $x(f) = \int x(t)e^{2\pi i f t}\,dt$
are connected to the thermal forces by a linear relation,
$x(f) = G(f)F_{\mathrm{th}}(f)$, where $G(f)$ is the Fourier transform of the Green
function (Supplementary Information, section 4). Consequently, $x(f)$ is a
filtered signal of the noise $F_{\mathrm{th}}(f)$, and the PSD of the Brownian
thermal noise, $\mathrm{PSD}_{\mathrm{th}}(f)$, is related to the measured PSD by

$$\mathrm{PSD}_{\mathrm{th}}(f) = |G(f)|^{-2}\,\mathrm{PSD}(f) \qquad (2)$$
Figure 2 | The colour of thermal force. a, Double-logarithmic plot of the
mean squared displacement (MSD) normalized to its long-time limit, $2k_BT/K$
($k_B$, Boltzmann’s constant), for an optically trapped ($K \approx 205$ µN m$^{-1}$),
melamine resin sphere ($R = 1.45$ µm) in water (green circles: $\tau_f = 2.3$ µs,
$\tau_K = 138.0$ µs) or acetone (blue circles: $\tau_f = 5.1$ µs, $\tau_K = 38.3$ µs).
b, Double-logarithmic plot of the corresponding PAF(t) blocked in 100 bins per decade
and normalized to its initial value, $\mathrm{PAF}(0) = k_BT/K$. The persistent
anticorrelations are visible after the zero-crossing (narrow spike) and follow a
$t^{-3/2}$ power-law decay. The green and blue lines indicate exponential
relaxations and serve as guide to the eye. Inset, log-linear plot of the same data.
c, Log-linear representation of the corresponding PSD blocked in 50 bins per
decade and normalized to its zero-frequency value, $2k_BT\tau_K/K$. The green and
blue lines are Lorentzian spectra for reference. Inset, magnified view of the
enhancement of the PSD, blocked in 20 bins per decade, reflecting the colour of
thermal noise. d, Direct representation of the PSD of $F_{\mathrm{th}}$ (equation (2)). The
black lines correspond to the full hydrodynamic theory including inertial
effects. The parameters $\tau_f$ and $\tau_K$ were extracted from the fit to the theory.
Error bars, 1 s.e. of the mean from blocking.


The data shown in Fig. 2d confirm the departure from white noise 
through a drastic increase in thermal noise at higher frequencies. 
Deviations from Gaussian white noise are towards the blue end of 
the spectrum at frequencies that are much smaller than the collision 
rate of the solvent molecules, and reflect the colour of thermal force.
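
A useful sanity check on equation (2) is the purely overdamped limit with instantaneous Stokes friction, for which the response function is $G(f) = 1/(K - 2\pi i f\gamma)$ and the position spectrum is the Lorentzian quoted earlier. In that limit the recovered force spectrum should be flat, with value $2k_BT\gamma$. The sketch below verifies this; the response function used here is the textbook Stokes form and is an assumption for illustration, not the full hydrodynamic Green function of the Supplementary Information, and the temperature is likewise assumed.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_response_sq(f, gamma, stiffness):
    """|G(f)|^2 for an overdamped bead: 1 / (K^2 + (2 pi f gamma)^2)."""
    return 1.0 / (stiffness ** 2 + (2.0 * math.pi * f * gamma) ** 2)


def lorentzian_psd(f, gamma, stiffness, temperature):
    """Position PSD of the overdamped bead, with tau_K = gamma / K."""
    tau_k = gamma / stiffness
    return 2.0 * (K_B * temperature / stiffness) * tau_k / (
        1.0 + (2.0 * math.pi * f * tau_k) ** 2
    )


def thermal_force_psd(position_psd, response_sq):
    """Equation (2): PSD_th(f) = |G(f)|^-2 PSD(f)."""
    return position_psd / response_sq


# Illustrative parameters: R = 1.45 um bead in acetone, K = 205 uN/m, T assumed
GAMMA = 6.0 * math.pi * 0.32e-3 * 1.45e-6
STIFFNESS = 205e-6
TEMPERATURE = 295.0

recovered = [
    thermal_force_psd(
        lorentzian_psd(f, GAMMA, STIFFNESS, TEMPERATURE),
        stokes_response_sq(f, GAMMA, STIFFNESS),
    )
    for f in (1.0e2, 1.0e4, 1.0e5)
]
```

Every entry of `recovered` equals the white-noise level $2k_BT\gamma$, so any rise of the measured force spectrum with frequency, as in Fig. 2d, signals friction with memory.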

The observed resonance in the PSD can be enhanced by increasing $K$
and hence decreasing the ratio $\tau_K/\tau_f$. Figure 3a shows peak
amplification with increasing laser power up to a stiffness of 412 µN m$^{-1}$.
In the hydrodynamic regime, Brownian motion is strongly sensitive to
particle size because the determinant timescale, $\tau_f$, is proportional to
$R^2$. A difference in the bead radius, $\Delta R$, of only a few per cent results in
a detectable shift in the PAF around its zero-crossing, in acetone as well
as in water (Fig. 3b).
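
Because $\tau_f \propto R^2$, a small relative change in radius roughly doubles as a relative change in the hydrodynamic timescale. The short sketch below makes this concrete for the bead pairs listed in the Fig. 3 caption; the function name is hypothetical.

```python
def tau_f_ratio(radius_a, radius_b):
    """Ratio tau_f(a) / tau_f(b) for two beads in the same fluid (tau_f ~ R^2)."""
    return (radius_a / radius_b) ** 2


acetone_pair = tau_f_ratio(1.45e-6, 1.35e-6)  # radii differ by about 7%
water_pair = tau_f_ratio(1.50e-6, 1.45e-6)    # radii differ by about 3%
```

The acetone pair differs by about 15% in $\tau_f$ and the water pair by about 7%, which is the lever that makes the shift around the zero-crossing detectable.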

Experimental access to short timescales reveals a resonance in
Brownian motion where overdamped motion is commonly assumed.
For a given solvent and particle, it is possible to investigate the dynamics
of the system in different regimes by decreasing $\tau_K$ and detecting the
position fluctuations at the highest bandwidth. Stronger and narrower
resonances can be obtained in the inertial regime, where Brownian
motion is also sensitive to the particle’s mass. To reach this window,
$\tau_K$ has to be brought close to $\tau_p$ by increasing $K$ or $m_p$. Although heavier
particles, which simultaneously allow for more efficient trapping, are
still to be developed, the timeline displayed in Fig. 1a can be explored
theoretically and by means of computer simulations. Transition to the
inertial regime is marked by the appearance of a peak in the case of the
Figure 3 | Enhancing resonance and sensitivity to particle size. a, Log-linear
representation of the normalized PSD, blocked in 50 bins per decade, of a resin
sphere ($R = 1$ µm) in acetone for increasing trap stiffness (green,
$K = 77$ µN m$^{-1}$; red, 309 µN m$^{-1}$; blue, 412 µN m$^{-1}$). b, Magnified view of a
log-linear plot of the normalized PAF(t), blocked in 100 bins per decade, close
to the zero-crossing for resin spheres of slightly different radii held by traps of
the same stiffness, in acetone (blue, $R = 1.45$ µm; red, 1.35 µm;
$K \approx 205$ µN m$^{-1}$) and water (green, 1.50 µm; magenta, 1.45 µm;
$K \approx 195$ µN m$^{-1}$). The black lines in each plot correspond to the full
hydrodynamic theory. Data were acquired and processed as described in
Fig. 2. Error bars, 1 s.e. of the mean from blocking.


harmonic oscillator with $\tau_K < 2\tau_p$. When $\tau_K$ is further decreased, the
mass term in the Langevin equation eventually becomes larger than the
friction term. Interestingly, in comparison with the simple harmonic
oscillator (Fig. 4a, dashed lines), the peak is significantly enhanced and
its position, $f_{\max}$, is shifted to lower frequencies by the contribution of
hydrodynamic memory (Fig. 4a, solid lines).
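
The condition $\tau_K < 2\tau_p$ follows from the position spectrum of a simple damped harmonic oscillator driven by white noise, which is proportional to $1/[(K - m_p\omega^2)^2 + (\gamma\omega)^2]$. Minimizing the denominator gives $\omega_{\max}^2 = 1/(\tau_p\tau_K) - 1/(2\tau_p^2)$, which is positive only when $\tau_K < 2\tau_p$. The sketch below encodes this textbook result; it deliberately omits hydrodynamic memory, so it corresponds to the dashed reference curves rather than the full theory, and the function name is hypothetical.

```python
import math


def oscillator_peak_frequency(tau_p, tau_k):
    """Peak frequency of the position PSD of a damped harmonic oscillator.

    Returns None when tau_K >= 2 * tau_p, in which case the spectrum
    decreases monotonically and has no resonance peak.
    """
    omega_sq = 1.0 / (tau_p * tau_k) - 1.0 / (2.0 * tau_p ** 2)
    if omega_sq <= 0.0:
        return None
    return math.sqrt(omega_sq) / (2.0 * math.pi)
```

For example, with times in arbitrary units, `oscillator_peak_frequency(1.0, 2.0)` sits exactly on the threshold and returns `None`, whereas `oscillator_peak_frequency(1.0, 1.0)` returns a finite frequency.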


[Figure 4 plot labels: inertial regime; legend entry $\tau_K/\tau_p = 2.4$; axes $2\pi f\tau_f$ and $K$ in units of $k_BT/b^2$.]


Figure 4 | Transition to the inertial regime. a, Theoretical PSD of a resin
sphere with no-slip boundary conditions in acetone ($R = 1.5$ µm, $\tau_p = 2.3$ µs,
$\tau_f = 5.4$ µs), for very strong traps (blue, $K = 1.02$ mN m$^{-1}$; green,
2.04 mN m$^{-1}$; red, 4.07 mN m$^{-1}$). The dashed lines show the results for a
damped harmonic oscillator, for comparison. The coloured circles represent
the corresponding $\mathrm{PSD}_{\mathrm{exc}}(f)$ with $g = 50\%$ and $f_{\mathrm{exc}} \approx 2f_{\mathrm{peak}}$. Inset,
corresponding fluid velocity field developing around the sphere when moving
in the x direction at $f = 1/2\pi\tau_f$. The arrows indicate the direction of the velocity
field. b, Simulation data (filled circles) for an equivalent system with t, = 0.7, 
evaluated in a compressible fluid under full-slip conditions, $\gamma = 4\pi\eta R$. The
coloured theoretical lines account for vortex diffusion, as well as for sound 
waves. The open circles are simulation data for an excitation with g = 50% and 
$f_{\mathrm{exc}} = 2f_{\max}$. Inset, corresponding fluid velocity field at $f = 1/2\pi\tau_f$, decreasing
from red to green. All curves in a and b are normalized to the zero-frequency 
value of the respective non-excited PSD. c, Normalized peak height for 
increasing trap strength, calculated for particles of different sizes and densities 
in acetone, where b is the unit length of simulations (Supplementary 
Information, Section 7) (thick lines; blue: $2b = R = 1.0$ µm, $\rho_p$ = 1,510 kg m$^{-3}$;
green: $R = 1.0$ µm, $\rho_p$ = 3,020 kg m$^{-3}$; red: $R = 2.0$ µm, $\rho_p$ = 1,510 kg m$^{-3}$)
and compared with the harmonic oscillator (dashed lines). Simulation data 
(symbols) are compared to the corresponding theory (thin lines). 
The study of Brownian motion on short timescales in a medium
with hydrodynamic effects has also become accessible by advanced 
simulation techniques. We used multiparticle collision dynamics 
(see ref. 27 and references therein) with molecular dynamics coupling 
between solute and solvent particles. The method yielded a compres- 
sible solvent and correctly reflected the hydrodynamic effects at 
coarse-grained scales. The approach was implemented most conveni- 
ently for full-slip boundary conditions at the solute-solvent interface, 
whereas our experiments obeyed no-slip conditions. The simulation 
results for the PSD show that a resonance emerges in the hydro- 
dynamic regime as well as in the inertial regime (Fig. 4b, filled circles), 
irrespective of boundary conditions at the solvent-solute interface. 
However, the weak coupling between the Brownian particle and the 
surrounding fluid yielded a resonance much weaker than that which 
emerged under no-slip conditions (Fig. 4a). The collected data follow 
the theoretical curves (Fig. 4b, coloured lines), where friction is eval- 
uated for a compressible fluid under full-slip conditions (Supplemen- 
tary Information, sections 4 and 7). 

The enhanced resonance is sensitive to the size of the bead in the 
hydrodynamic regime (Fig. 4c, red curves versus green and blue 
curves). Also, it is mass sensitive in the inertial regime of the simulation 
and, more markedly, under experimental conditions. In contrast, for 
the harmonic oscillator, sensitivity to particle size is much lower and 
sensitivity to its mass occurs only in the underdamped regime, where 
$\tau_K < 2\tau_p$ (Fig. 4c, dashed lines).

Additional peak amplification can be achieved through parametric
resonance by periodically modulating the trap strength at
frequency $f_{\mathrm{exc}}$: $K(t) = K[1 + g\cos(2\pi f_{\mathrm{exc}}t)]$. We obtained the theoretical
excited PSD, $\mathrm{PSD}_{\mathrm{exc}}(f)$, normalized to the initial value of the
non-excited PSD, from the solution of a parametrically modulated
Langevin equation, including hydrodynamic memory, using
second-order perturbation theory in the reduced modulation amplitude, $g$
(Supplementary Information, section 6). As for a harmonic oscillator,
also in the presence of coloured friction, the greatest additional peak
amplification is achieved at a frequency of $f_{\mathrm{exc}} \approx 2f_{\mathrm{peak}}$, which yields an
increase of up to 20% when $g = 50\%$ (Fig. 4a, open circles).
Comparable results were obtained with computer simulations (Fig. 4b, open
circles).

Exploiting the hydrodynamic and inertial regimes of Brownian
motion for particle-based assays will become a common technological
approach. We anticipate that changes in the particle’s morphology,
such as swelling or a reaction occurring at its surface, will alter
short-time dynamics and become detectable. As single cells, microorganisms
and microcarriers can also be bound harmonically, short-time
detection of their Brownian fluctuations may become a sensitive way
to characterize their state or evolution in native solutions and without
specific markers. Reciprocally, changes in the medium surrounding
the probing particle modulate the particle’s fluctuation spectrum,
offering a means of studying dynamic polymer systems in great detail.


METHODS SUMMARY 


Melamine resin microspheres ($\rho_p$ = 1,510 kg m$^{-3}$; $R$ = 1.5, 1.45, 1.35 or 1 µm) were
suspended in high-purity acetone ($\eta$ = 0.32 cP (1 cP = 1 mPa s), $\rho_f$ = 790 kg m$^{-3}$)
or water ($\eta$ = 0.95 cP, $\rho_f$ = 1,000 kg m$^{-3}$) at minimal concentrations to allow
trapping and observation of a single particle. After loading, the sample chamber was
mounted onto the 3D piezo stage of our custom-made inverted microscope/optical
trap set-up. A bead was trapped in the focus of a Gaussian trapping beam
produced by a diode-pumped, ultralow-noise Nd:YAG laser with a wavelength of
$\lambda$ = 1,064 nm and a maximal output power of 500 mW in continuous-wave mode.
At the focus, the remaining power was measured to be 200 mW for the stiffest traps
used. To avoid surface effects, the trapped bead was brought by the piezo stage no
closer than 40 µm to the top or bottom glass surface of our sample chambers, which
were more than 100 µm thick. Fluctuations in the position of the bead were detected
in 3D by an InGaAs quadrant photodiode with a diameter of 2.0 mm. The signals
from the quadrant photodiode were fed into a custom-built preamplifier, which
provided two differential signals between the photodiode quadrants, giving the
fluctuations in the x and y directions, and one signal that is proportional to the


6 OCTOBER 2011 | VOL 478 | NATURE | 87 


©2011 Macmillan Publishers Limited. All rights reserved 


LETTER 


total light intensity, yielding the fluctuation in the direction parallel to the optical 
axis, z. Subsequently, we used differential amplifiers to adjust the preamplifier 
signals for optimal digitalization by the data acquisition board with a dynamic 
range of 12 bits. All data were collected for $T \approx 50$ s at a sampling rate of 1 MHz,
corresponding to ~5 × 10$^7$ data points. The PSD presented in Figs 2 and 3 were
computed from overlapping windows of 2”* points. 
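
The acquisition figures quoted above are easy to cross-check. The sketch below derives the number of samples, the number of digitization levels and the highest resolvable (Nyquist) frequency from the stated duration, sampling rate and bit depth; the function name is hypothetical, and the Nyquist limit is a general property of uniform sampling rather than something stated by the authors.

```python
def acquisition_summary(duration_s, sampling_rate_hz, adc_bits):
    """Return (number of samples, digitization levels, Nyquist frequency in Hz)."""
    samples = round(duration_s * sampling_rate_hz)
    levels = 2 ** adc_bits
    nyquist_hz = sampling_rate_hz / 2.0
    return samples, levels, nyquist_hz


samples, levels, nyquist_hz = acquisition_summary(50.0, 1.0e6, 12)
```

This reproduces the 5 × 10$^7$ data points per recording, with 4,096 levels available to the digitizer and spectra extending up to 500 kHz.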
The simple mechanical oscillator, canonically consisting of a coupled
mass-spring system, is used in a wide variety of sensitive measurements,
including the detection of weak forces and small masses. On
the one hand, a classical oscillator has a well-defined amplitude of
motion; a quantum oscillator, on the other hand, has a lowest-energy
state, or ground state, with a finite-amplitude uncertainty
corresponding to zero-point motion. On the macroscopic scale of our
everyday experience, owing to interactions with its highly fluctuating
thermal environment a mechanical oscillator is filled with many
energy quanta and its quantum nature is all but hidden. Recently, in
experiments performed at temperatures of a few hundredths of a
kelvin, engineered nanomechanical resonators coupled to electrical
circuits have been measured to be oscillating in their quantum
ground state. These experiments, in addition to providing a
glimpse into the underlying quantum behaviour of mesoscopic
systems consisting of billions of atoms, represent the initial steps
towards the use of mechanical devices as tools for quantum
metrology or as a means of coupling hybrid quantum systems.
Here we report the development of a coupled, nanoscale optical and
mechanical resonator formed in a silicon microchip, in which radiation
pressure from a laser is used to cool the mechanical motion
down to its quantum ground state (reaching an average phonon
occupancy number of 0.85 ± 0.08). This cooling is realized at an
environmental temperature of 20 K, roughly one thousand times
larger than in previous experiments, and paves the way for optical
control of mesoscale mechanical oscillators in the quantum regime.

It has been known for some time that atoms and ions nearly
resonant with an applied laser beam (or series of beams) may be
mechanically manipulated, even trapped and cooled down to the quantum
ground state of their centre-of-mass motion. Equally well known is
the fact that radiation pressure can be exerted on ordinary (that is,
non-resonant) dielectric objects to damp and cool their mechanical motion.
In ‘cavity-assisted’ schemes, the radiation pressure force is enhanced by
coupling the motion of a mechanical object to the electromagnetic field
in a resonant cavity. Pumping of the cavity by a single-frequency
electromagnetic source produces a coupling between the mechanical
motion and the intensity of the electromagnetic field built up in the
resonator. Because the radiation pressure force exerted on the
mechanical object is proportional to the field intensity in the resonator, a
form of dynamical back-action results. For a lower-frequency (‘red’)
detuning of the pump source from the cavity, this leads to damping and
cooling of the mechanical motion.

Recent experiments involving micro- and nanomechanical resonators
coupled to electromagnetic fields at optical and microwave frequencies
have demonstrated significant dynamic back-action due to radiation
pressure. These structures have included Fabry-Pérot cavities
with mechanically compliant miniature end mirrors or internal
nanomembranes, whispering-gallery glass resonators, nanowires
capacitively coupled to co-planar microwave transmission line
cavities and lumped-circuit microwave resonators with deformable,
nanoscale, vacuum-gap capacitors. The first measurement of an
engineered mesoscopic mechanical resonator predominantly in its
quantum ground state, however, was performed not using back-action
cooling but rather using conventional cryogenic cooling (bath
temperature, $T_b \approx 25$ mK) of a high-frequency and, thus,
low-thermal-occupancy oscillator. Read-out and control of mechanical motion at
the single-quantum level was performed by strongly coupling the
gigahertz-frequency piezoelectric mechanical resonator to a resonant
superconducting quantum circuit. Only recently have microwave
systems, also operating at bath temperatures of $T_b \approx 25$ mK, used radiation
pressure back-action to cool a high-Q-factor, megahertz-frequency
mechanical oscillator to the ground state.

Optically coupled mechanical devices, although they allow for control
of the mechanical system through well-established quantum optical
techniques, have thus far not reached the quantum regime owing to
a great number of technical difficulties. A particular challenge has
been maintaining efficient optical coupling and low-loss optics and
mechanics in a cryogenic, subkelvin environment. The optomechanical
system studied in this work allows large optical coupling to a high-Q,
gigahertz-frequency mechanical oscillator, offering both efficient
back-action cooling and significantly higher operating temperatures. As
shown in Fig. 1a, the system consists of an integrated optical and
mechanical nanoscale resonator formed in the surface layer of a
silicon-on-insulator microchip. The periodic patterning of the
nanobeam is designed to result in Bragg scattering of both optical and
acoustic guided waves. A perturbation in the periodicity at the centre
of the beam results in co-localized optical and mechanical resonances
(Fig. 1b, c), which are coupled through radiation pressure. The
fundamental optical resonance of the structure occurs at a frequency of
$\omega_o/2\pi = 195$ THz ($\lambda$ = 1,537 nm), whereas, owing to the speed of
sound being much less than the speed of light, the mechanical
resonance occurs at $\omega_m/2\pi = 3.68$ GHz. To minimize mechanical
damping in the structure, an external acoustic radiation shield is added in the
periphery of the nanobeam (Fig. 1d, e). This shield consists of a
two-dimensional ‘cross’ pattern, which has been shown both theoretically
and experimentally to yield a substantial phononic bandgap in the
gigahertz frequency band.

We use a fibre-taper nanoprobe, formed from standard single-mode
optical fibre, to optically couple to the silicon nanoscale resonators.
As shown in Fig. 2, a tunable laser (New Focus Velocity swept laser;
200-kHz linewidth) is used to cool optically and transduce the
mechanical motion of the nanomechanical oscillator. Placing the
optomechanical devices into a continuous-flow helium cryostat provides
pre-cooling down to $T_b \approx 20$ K, reducing the bath occupancy of the
3.68-GHz mechanical mode to $n_b \approx 100$. At this temperature, the
mechanical Q-factor increases up to a measured value of $Q_m \approx 10^5$,
corresponding to an intrinsic mechanical damping rate of
$\gamma_i/2\pi = 35$ kHz. The optical Q-factor is measured to be $Q_o = 4 \times 10^5$,
corresponding to an optical linewidth of $\kappa/2\pi = 500$ MHz, slightly
reduced from its room-temperature value.
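
The quoted linewidths and bath occupancy can be cross-checked from the resonance frequencies, Q-factors and temperature. The sketch below uses the relation linewidth = frequency/Q and the Bose-Einstein occupancy $1/(e^{hf/k_BT} - 1)$; the function names are hypothetical. The exact Bose-Einstein form is used here instead of the high-temperature approximation $k_BT_b/\hbar\omega_m$, which is a choice made for this illustration.

```python
import math

H = 6.62607015e-34   # Planck constant in J s
K_B = 1.380649e-23   # Boltzmann constant in J/K


def linewidth_hz(frequency_hz, q_factor):
    """Full linewidth (rate / 2 pi, in Hz) of a resonance with a given Q."""
    return frequency_hz / q_factor


def thermal_occupancy(frequency_hz, temperature_k):
    """Bose-Einstein occupancy of a mode at the given bath temperature."""
    return 1.0 / math.expm1(H * frequency_hz / (K_B * temperature_k))


optical_linewidth = linewidth_hz(195e12, 4e5)     # about 490 MHz
mechanical_linewidth = linewidth_hz(3.68e9, 1e5)  # about 37 kHz
n_bath_20k = thermal_occupancy(3.68e9, 20.0)
n_bath_17k6 = thermal_occupancy(3.68e9, 17.6)
```

The results agree with the rounded values of 500 MHz and 35 kHz given above. The occupancy is about 113 at a nominal 20 K and close to 100 at the 17.6 K mode temperature reported for the device.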
Figure 1 | Optomechanical resonator with phononic shield. a, Scanning
electron microscope (SEM) image of the patterned silicon nanobeam and the 
external phononic bandgap shield. b, Enlarged SEM image of the central cavity 
region of the nanobeam. c, Top: normalized electric field (colour scale) of the 
localized optical resonance of the nanobeam cavity, simulated using the finite- 
element method (FEM). Bottom: FEM simulation of the normalized 
displacement field of the acoustic resonance (breathing mode), which is 
coupled by radiation pressure to the co-localized optical resonance. The 


In the resolved-sideband limit, where $\omega_m/\kappa \gg 1$, driving the system
with a laser (frequency, $\omega_l$) tuned to the red side of the optical cavity
(detuning, $\Delta = \omega_o - \omega_l = \omega_m$), creates an optically induced
damping, $\gamma_{\mathrm{OM}}$, of the mechanical resonance. In the weak-coupling
regime ($\gamma_{\mathrm{OM}} \ll \kappa$), the optical back-action damping is given by
$\gamma_{\mathrm{OM}} = 4g^2n_c/\kappa$, where $n_c$ is the average number of drive-laser photons
stored in the cavity and $g$ is the optomechanical coupling rate between
the mechanical and optical modes. This coupling rate, $g$, is quantified
as the shift in the optical resonance for an amplitude of motion equal to


Figure 2 | Experimental set-up. A single, tunable, 1,550-nm diode laser is 
used as the cooling and mechanical transduction beam sent into the nanobeam 
optomechanical resonator cavity held in a continuous-flow helium cryostat. A 
wavemetre (WM) is used to track and lock the laser frequency, and a variable 
optical attenuator (VOA) is used to set the laser power. The transmitted signal 
is amplified by an erbium-doped fibre amplifier (EDFA) and detected on a 
high-speed photodetector (D2) connected to a real-time spectrum analyser 
(RSA), where the mechanical noise power spectrum is measured. A slowly 
modulated probe signal used for optical spectroscopy and calibration is 
generated from the cooling laser beam using an amplitude electro-optic 
modulator (EOM) driven by a microwave source (RFSG). The reflected 
component of this signal is separated from the input by an optical circulator 
(CIRC), sent to a photodetector (D1) and then demodulated using a lock-in 
amplifier (LIA). Paddle-wheel fibre polarization controllers (FPCs) are used to 
set the laser polarization at the input to the EOM and the input to the 
optomechanical cavity. For more detail, see Supplementary Information. 
displacement field is indicated by the exaggerated deformation of the structure,
with the relative magnitude of the local displacement (strain) indicated by the
colour. d, SEM image of the interface between the nanobeam and the phononic
bandgap shield. e, FEM simulation of the normalized squared displacement
field amplitude of the localized acoustic resonance at the nanobeam-shield
interface, indicating the strong suppression of acoustic radiation provided by
the phononic bandgap shield. The colour scale represents $\log[x^2/\max(x^2)]$,
where $x$ is the displacement field amplitude.


the zero-point fluctuation amplitude ($x_{\mathrm{zpf}} = (\hbar/2m\omega_m)^{1/2}$, where $m$ is
the motional mass of the localized acoustic mode and $\hbar$ is Planck’s
constant divided by $2\pi$). The optomechanical damping, which is a
result of the preferential scattering of drive photons into the
upper-frequency sideband, also cools the mechanical mode. For a
quantum-limited drive laser, the phonon occupancy of the mechanical oscillator
can be reduced from $n_b = k_BT_b/\hbar\omega_m \gg 1$ to $\bar{n} = n_b/(1 + C) + n_{\min}$,
where $k_B$ is Boltzmann’s constant and $C \equiv \gamma_{\mathrm{OM}}/\gamma_i$ is the cooperativity.
The residual scattering of drive photons into the lower-frequency
sideband limits the cooled phonon occupancy to $n_{\min} = (\kappa/4\omega_m)^2$,
which is determined by the level of sideband resolution.
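
The cooling relations above can be evaluated with the device parameters quoted elsewhere in the text ($g/2\pi = 910$ kHz, $\kappa/2\pi = 500$ MHz, $\gamma_i/2\pi = 35$ kHz, $\omega_m/2\pi = 3.68$ GHz). The sketch below does so with all rates expressed as rate$/2\pi$ in hertz, which leaves the formulas unchanged because the factors of $2\pi$ cancel. The function names are hypothetical, and the ground-state probability $1/(1 + \bar{n})$ is the standard result for a thermal state, included here as an added illustration.

```python
def backaction_damping(g, n_c, kappa):
    """gamma_OM = 4 g^2 n_c / kappa (weak-coupling, resolved-sideband limit)."""
    return 4.0 * g ** 2 * n_c / kappa


def cooperativity(g, n_c, kappa, gamma_i):
    """C = gamma_OM / gamma_i."""
    return backaction_damping(g, n_c, kappa) / gamma_i


def cooled_occupancy(n_b, coop, kappa, omega_m):
    """n_bar = n_b / (1 + C) + n_min, with n_min = (kappa / (4 omega_m))^2."""
    n_min = (kappa / (4.0 * omega_m)) ** 2
    return n_b / (1.0 + coop) + n_min


def ground_state_probability(n_bar):
    """Probability of zero phonons in a thermal state of mean occupancy n_bar."""
    return 1.0 / (1.0 + n_bar)


# Rates as rate / (2 pi), in Hz
G, KAPPA, GAMMA_I, OMEGA_M = 910e3, 500e6, 35e3, 3.68e9

c_low = cooperativity(G, 1.4, KAPPA, GAMMA_I)     # low-power drive
c_probe = cooperativity(G, 56.0, KAPPA, GAMMA_I)  # drive used for the reflection spectrum
ideal = cooled_occupancy(100.0, cooperativity(G, 2000.0, KAPPA, GAMMA_I), KAPPA, OMEGA_M)
p_ground = ground_state_probability(0.85)
```

These inputs give $C \approx 0.27$ for $n_c = 1.4$ and $C \approx 11$ for $n_c = 56$, matching the values quoted for those drive powers. The ideal occupancy at $n_c \approx 2{,}000$ comes out below the measured 0.85 ± 0.08, and an occupancy of 0.85 corresponds to a ground-state probability of about 54%.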

The drive laser, in addition to providing mechanical damping and
cooling, can be used to measure the mechanical and optical properties
of the system through a series of calibrated measurements. In a first set
of measurements, we use the noise power spectral density (PSD) of
the drive laser transmitted through the optomechanical cavity to
perform spectroscopy of the mechanical mode. As shown in
Supplementary Information, the noise PSD of the photocurrent generated
by the transmitted field of the drive laser with red-sideband
detuning ($\Delta = \omega_m$) yields a Lorentzian component of the single-sided
PSD proportional to $S_b(\omega) = \bar{n}\gamma/((\omega - \omega_m)^2 + (\gamma/2)^2)$, where
$\gamma = \gamma_i + \gamma_{\mathrm{OM}} = \gamma_i(1 + C)$ is the total mechanical damping rate. For a
blue laser detuning of $\Delta = -\omega_m$, the optically induced damping is
negative ($\gamma_{\mathrm{OM}} = -4g^2n_c/\kappa$) and the photocurrent noise PSD is
proportional to $S_b(\omega) = (\bar{n} + 1)\gamma/((\omega - \omega_m)^2 + (\gamma/2)^2)$. Typical
measured noise power spectra under low-power laser drive ($n_c = 1.4$,
$C = 0.27$), for both red detuning and blue detuning, are shown in
Fig. 3a. Even at these small drive powers, the effects of back-action
on the measured spectra are evident, with the red-detuned drive
broadening the mechanical line and the blue-detuned drive narrowing
the line. The noise floor in Fig. 3a (shaded in grey) corresponds to the
noise generated by the EDFA used to pre-amplify the transmitted
drive-laser signal before photodetection, and is several orders of
magnitude greater than the electronic noise of the photoreceiver and the
real-time spectrum analyser.

Calibration of the EDFA gain, along with the photoreceiver and
real-time spectrum analyser photodetection gain, makes it possible
to convert the measured area under the photocurrent noise PSD into
a mechanical mode phonon occupancy. As described in detail in
Supplementary Information, we perform these calibrations, along with
measurements of low-drive-power ($C < 1$), radio-frequency spectra of
both detunings ($\Delta = \pm\omega_m$), to provide accurate, local thermometry of
the optomechanical cavity. An example of this form of calibrated mode
thermometry is shown in Fig. 3b, where we plot the optically measured
Figure 3 | Mechanical and optical response. a, Typical measured mechanical
noise spectra around the resonance frequency of the breathing mode for low
drive-laser power ($n_c = 1.4$). The blue and red curves correspond to the spectra
measured with the drive laser blue- and, respectively, red-detuned by a
mechanical frequency from the optical cavity resonance. The black trace
corresponds to the measured noise floor (dominated by EDFA noise) with the
drive laser detuned far from the cavity resonance. b, Plot of the measured
(squares) mechanical mode bath temperature ($T_b$) as a function of cryostat
sample mount temperature ($T_c$). The dashed line indicates the curve
corresponding to perfect following of the cryostat temperature by the mode
temperature ($T_b = T_c$). c, Typical reflection spectrum (normalized power
reflection) of the cavity while driven by the cooling laser ($\Delta = \omega_m$, $n_c = 56$,
$C = 11$), as measured by a weaker probe beam at two-photon detuning $\Delta_{SL}$. The
signature reflection dip on resonance with the bare cavity mode, highlighted in
the inset, is indicative of EIT caused by coupling of the optical and mechanical
degrees of freedom by the cooling laser beam.


mechanical mode bath temperature, $T_b$, as a function of the cryostat
sample mount temperature, $T_c$ (independently measured using a
silicon diode thermometer attached to the copper sample mount).
Figure 3b shows that the optical mode thermometry predicts a mode
temperature in good correspondence with the absolute temperature of
the sample mount for $T_c > 50$ K; below this value, the mode
temperature deviates from $T_c$ and saturates to a value of $T_b = 17.6 \pm 0.8$ K
owing to thermal radiative heating of the device through the imaging
aperture in the radiation shield of our cryostat.

In a second set of measurements, we determine the mechanical 
damping, γ, and the cavity-laser detuning, Δ, by optical spectroscopy 
of the driven cavity. By sweeping a second probe beam, of frequency 
ω_s, over the cavity, with the cooling beam tuned to Δ = ω_m, spectra 
showing electromagnetically induced transparency (EIT) are measured 
(Fig. 3c). Owing to the high single-photon cooperativity of the 
system, an intracavity population of only n_c ~ 5 switches the system 
from reflecting to transmitting for the probe beam. The corresponding 
dip at the centre of the optical cavity resonance occurs at a two-photon 
detuning (probe frequency, ω_s, minus cooling-beam frequency) of 
Δ_SC = ω_m and has a bandwidth equal to the 
mechanical damping rate, γ_i(1 + C). In Fig. 4a, we plot the measured 
mechanical linewidth as a function of intracavity photon number, 
showing good correspondence between both mechanical and optical 
spectroscopy techniques, and indicating that the system remains in the 
weak-coupling regime for all measured cooling powers. From a fit to 
the measured mechanical damping rate as a function of n_c (Fig. 4a, 
dashed red line), the zero-point optomechanical coupling rate is 
determined to be g/2π = 910 kHz. 

In Fig. 4b, we plot the calibrated Lorentzian noise PSD area, in units of 
phonon occupancy, as a function of red-detuned (Δ = ω_m) drive-laser 
power. Owing to the low effective temperature of the laser drive, the 
mechanical mode is not only damped but is also cooled substantially. 
The minimum measured average mode occupancy for the highest drive 
power (corresponding to n_c ~ 2,000) is n̄ = 0.85 ± 0.08, putting the 
mechanical oscillator in a thermal state with ground-state occupancy 
probability greater than 50%. The dashed blue line in Fig. 4b represents 
the ideal back-action-cooled phonon occupancy estimated using 
both the measured mechanical damping rate in Fig. 4a and the 
low-drive-power intrinsic mechanical damping rate. Deviation of the 
measured phonon occupancy from the ideal cooling model is seen to 
occur at the highest drive powers and results from both an increase in 
the bath temperature due to optical absorption (Fig. 4c) and an increase 
in the intrinsic mechanical damping rate (Fig. 4d) induced by the 
generation of free carriers through optical absorption (Supplementary 
Information). To evaluate the efficiency of the optical transduction of 
the mechanical motion, we also plot (Fig. 4e) the measured background 
noise PSD, or imprecision level. The minimum measured imprecision, 
occurring for n_c ~ 500, corresponds to n_imp ~ 20 in units of phonon 
quanta when referred to the peak Lorentzian level of the transduced 
mechanical motion (Supplementary Information). Comparing the 
measured imprecision with the respective theoretical imprecision levels 
for shot-noise-limited detection (Fig. 4e, dashed red curve) and ideal 
quantum-limited motion transduction (Fig. 4e, black curve) indicates 
that n_EDFA ~ 15 stems from the excess noise imparted by the EDFA 
optical amplifier. The remaining n_loss ~ 5 is due to optical loss of signal 
inside the cavity (11.7 dB) and in the optical fibre output waveguide 
(2 dB). 

Figure 4 | Optical cooling results. a, Measured mechanical mode linewidth 
(squares), EIT transparency bandwidth (circles) and predicted optomechanical 
damping rate estimated using the zero-point optomechanical coupling rate, 
g/2π = 910 kHz (red dashed line). Inset, measured EIT transparency window at 
the highest cooling-beam drive power. b, Measured (circles) average phonon 
number, n̄, in the breathing mechanical mode at ω_m/2π = 3.68 GHz, versus 
cooling drive-laser power (in units of intracavity photons, n_c), as deduced from 
the calibrated area under the Lorentzian line shape of the mechanical noise 
power spectrum. The inset spectra show the measured noise PSD (using 
x_zpf = 2.7 fm, corresponding to the numerically computed motional mass for 
the breathing mode with m = 311 fg). The dashed blue line indicates the 
estimated mode phonon number calculated from the measured optical 
damping alone. Error bars indicate estimated uncertainties as outlined in 
Supplementary Information. c, Estimated bath temperature, T_b, versus cooling 
laser intracavity photon number, n_c. d, Measured change in the intrinsic 
mechanical damping rate versus n_c (circles). A polynomial fit to the mechanical 
damping dependence on n_c is shown as a dashed line. For more details, see 
Supplementary Information. e, The measured (squares) background noise PSD 
versus drive-laser power (n_c), in units of effective phonon quanta. The red 
dashed curve corresponds to the theoretical imprecision assuming shot-noise- 
limited detection but all other cavity properties and optical loss as in the 
experiment. The solid black curve is for an ideal, quantum-limited continuous 
position measurement of mechanical motion. 
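
The ground-state claim made for the cooled mode (n̄ = 0.85 ± 0.08, occupancy probability above 50%) can be checked with a few lines of Python. This is an illustrative sketch, not part of the reported analysis: it assumes the mode is in a thermal (Bose-Einstein) state, as the text states, for which the ground-state probability is 1/(1 + n̄). The function name is hypothetical.

```python
def ground_state_probability(n_bar: float) -> float:
    """Probability of finding zero phonons in a thermal state.

    For a thermal (Bose-Einstein) distribution with mean occupancy n_bar,
    P(n) = n_bar**n / (1 + n_bar)**(n + 1), so P(0) = 1 / (1 + n_bar).
    """
    if n_bar < 0:
        raise ValueError("mean occupancy must be non-negative")
    return 1.0 / (1.0 + n_bar)


# Values quoted in the text: n_bar = 0.85 +/- 0.08
n_bar, err = 0.85, 0.08
p0 = ground_state_probability(n_bar)
p0_low = ground_state_probability(n_bar + err)   # pessimistic end of the error bar
p0_high = ground_state_probability(n_bar - err)  # optimistic end of the error bar

print(f"P(0) = {p0:.3f} (range {p0_low:.3f} to {p0_high:.3f})")
```

Under this assumption the probability stays above one half across the quoted uncertainty, consistent with the statement in the text.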

Looking forward, the optical back-action cooling and thermometry, 
as performed in this work, represents only a first step towards optical 
measurement and control of the quantum state of a nanomechanical 
object. The mechanical system, although cooled to a mode occupancy 
of less than one, is still prepared in a classical thermal state, with its 
quantum zero-point fluctuations hidden by our measurement scheme. 
However, experiments to prepare and measure non-classical quantum 
states of the mechanical system are now within reach. A basic requirement 
for optomechanical experiments in the quantum regime is the 
ability to exchange photons with the mechanical resonator on a timescale 
shorter than that for a single thermal phonon to enter the mechanical 
system from the environment. The latter, called the thermal 
decoherence time, is given by τ_th = ħQ_m/(k_B T_b), and the timescale on 
which the mechanical resonator exchanges photons with an optical 
input is τ_OM = 1/γ_OM. The requirement that τ_OM < τ_th is equivalent to 
the requirement for optical back-action cooling of the mechanical 
oscillator to n̄ < 1, and is thus realized for the optomechanical crystal 
devices reported here. This allows for optomechanical entanglement 
between light and mechanics or quantum state transfer between 
single optical photons and mechanical phonons, enabling mechanical 
systems to function as both quantum transducers and quantum 
memory elements. In addition, the chip-scale nature of the optomechanical 
crystal architecture naturally lends itself to the creation of 
coupled photon-phonon circuits, facilitating not only the coupling 
of multiple mechanical and optical objects, but also allowing for the 
integration of optomechanics with other quantum systems such as 
superconducting quantum circuits. Finally, if a regime of strong coupling 
at the single-quantum level (g/κ > 1) could be reached, many 
new opportunities would be available, not least the study of nonlinear 
phononics at the single-phonon level and the generation of highly 
non-classical quantum states in mechanical or optical systems. 
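
The equivalence between τ_OM < τ_th and cooling to n̄ < 1 is stated above without the intermediate step. The following sketch fills it in. It rests on assumptions that are not spelled out in the passage: the high-temperature limit for the bath occupancy n_b, the usual definition Q_m = ω_m/γ_i, optical damping much larger than intrinsic damping, and neglect of the quantum back-action limit.

```latex
% Assumptions (ours, for illustration): k_B T_b >> \hbar\omega_m,
% Q_m = \omega_m / \gamma_i, and \gamma_{OM} >> \gamma_i.
\begin{align}
\tau_{\mathrm{th}} &= \frac{\hbar Q_m}{k_B T_b},
\qquad
\tau_{\mathrm{OM}} = \frac{1}{\gamma_{\mathrm{OM}}} \\
n_b &\approx \frac{k_B T_b}{\hbar \omega_m},
\qquad
\gamma_i = \frac{\omega_m}{Q_m}
\;\Longrightarrow\;
\gamma_i n_b \approx \frac{k_B T_b}{\hbar Q_m} = \frac{1}{\tau_{\mathrm{th}}} \\
\bar{n} &\approx \frac{\gamma_i n_b}{\gamma_i + \gamma_{\mathrm{OM}}}
\approx \frac{\gamma_i n_b}{\gamma_{\mathrm{OM}}}
= \frac{\tau_{\mathrm{OM}}}{\tau_{\mathrm{th}}}
\end{align}
```

Read this way, n̄ < 1 and τ_OM < τ_th are the same condition: the optical exchange rate γ_OM must exceed the rate γ_i n_b at which thermal phonons enter the mode.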

A carbon isotope challenge to the snowball Earth 


P. Sansjofre, M. Ader, R. I. F. Trindade, M. Elie, J. Lyons, P. Cartigny & A. C. R. Nogueira 


The snowball Earth hypothesis postulates that the planet was 
entirely covered by ice for millions of years in the Neoproterozoic 
era, in a self-enhanced glaciation caused by the high albedo of the 
ice-covered planet. In a hard-snowball picture, the subsequent rapid 
unfreezing resulted from an ultra-greenhouse event attributed to 
the buildup of volcanic carbon dioxide (CO2) during glaciation. 
High partial pressures of atmospheric CO2 (pCO2; from 20,000 to 
90,000 p.p.m.v.) in the aftermath of the Marinoan glaciation 
(~635 Myr ago) have been inferred from both boron and triple 
oxygen isotopes. These pCO2 values are 50 to 225 times higher than 
present-day levels. Here, we re-evaluate these estimates using paired 
carbon isotopic data for carbonate layers that cap Neoproterozoic 
glacial deposits and are considered to record post-glacial sea level 
rise. The new data reported here for Brazilian cap carbonates, 
together with previous ones for time-equivalent units, provide 
pCO2 estimates lower than 3,200 p.p.m.v., and possibly as low as 
the current value of ~400 p.p.m.v. Our new constraint, and our 
re-interpretation of the boron and triple oxygen isotope data, provide a 
completely different picture of the late Neoproterozoic environment, 
with low atmospheric concentrations of carbon dioxide and 
oxygen that are inconsistent with a hard-snowball Earth. 

Thousands of carbon isotope data have been reported for 
Neoproterozoic successions in the past decade, yet the full 
palaeoenvironmental significance of these data is still largely unappreciated. 
In particular, coupled carbon isotope data from organic carbon and 
carbonate have the potential to solve the longstanding conundrum of 
carbon dioxide concentrations in the aftermath of Neoproterozoic 
glaciations. This is possible because the difference between the carbon 
isotope ratio for carbonates (δ13C_carb) and that for associated organic 
matter (δ13C_org), Δ13C_carb-org, depends strongly on the concentration 
of dissolved CO2 in the ocean ([CO2]aq; ref. 9), which can be related to 
pCO2 (ref. 10). This is illustrated by the decrease of Δ13C_carb-org in the 
past 20 Myr to today's value of ~22‰ (ref. 9), which accompanied the 
drawdown of atmospheric pCO2 to the pre-industrial value of 
280 p.p.m.v. In contrast, earlier in the Phanerozoic eon, Δ13C_carb-org 
mostly remained in the range 28-32‰ (ref. 9), except for brief episodes 
of lower Δ13C_carb-org such as those reported in the upper Ordovician 
and at the Permo-Triassic boundary. 

We obtained paired δ13C_carb and δ13C_org values for post-glacial cap 
carbonates from western Brazil. Cap carbonates are the stratigraphic 
horizon marker defining the base of the Ediacaran period (635- 
542 Myr before present). They were supposedly deposited during the 
post-glacial sea level rise, in a supersaturated ocean and ultra-greenhouse 
climate resulting from high atmospheric CO2 levels following 
a snowball Earth. The studied cap carbonates come from the 
southeastern margin of the Amazonian craton, away from the 
metamorphic Paraguay belt. They form a transgressive systems tract 
directly above diamictites of the Puga Formation, starting with pink 
stromatolitic dolostones of the Mirassol d’Oeste Formation (~13 m 
thick) deposited in shallow oxic waters, overlain by dark grey limestone 
and shale of the Guia Formation (~50 m thick) deposited below 
storm wave base. These strata have been correlated to the Marinoan 
event on the basis of their δ13C_carb of about −5‰, 87Sr/86Sr of ~0.7078 
and Pb-Pb carbonate age of 627 ± 32 Myr (Fig. 1a; Supplementary 
Information section 2). The measured δ13C_org values (n = 29) are 
homogeneous, with an average value of −27.3 ± 0.5‰ (Supplementary 
Table 1). δ13C_carb values are also very homogeneous, averaging 
−4.8 ± 0.6‰, consistent with previous results on the same unit and on 
other cap carbonates worldwide (Supplementary Fig. 8). The resulting 
Δ13C_carb-org values average to 22.7 ± 0.8‰ (Fig. 1a). 

Figure 1 | Isotope and age data for cap carbonates. Paired carbon isotope 
data (δ13C_org and δ13C_carb) for cap carbonate successions from Brazil 
(Amazonia), North China and South China (this work and refs 4-8), along with 
geochronological data and 87Sr/86Sr ratios (see Supplementary Information 
sections 2.1 and 2.3 for references). Potentially altered 87Sr/86Sr values for 
Zhamoketi cap dolostones are shown in grey. 

All other Marinoan cap carbonates for which paired carbon isotope 
data are available present similarly low Δ13C_carb-org values (Fig. 1). In 
North China, the 10-m-thick Zhamoketi cap carbonate, deposited 
above the Tereeken diamictite, shows a very stable Δ13C_carb-org signal 
of 23.6 ± 1.5‰ (n = 52, Fig. 1b). In South China, Δ13C_carb-org data for 
the 3-6-m-thick Doushantuo cap carbonates are available for six 
sections located along a north-south transect across the Yangtze 
platform. These sections reveal again a low Δ13C_carb-org signal, with 
an average of 22.7 ± 1.4‰ (n = 105, Fig. 1c). Other cap carbonate 
successions also show systematically low values: the Noonday dolomite, 
Death Valley (average 19.1 ± 2.7‰, n = 9), the Maieberg Formation, 
Namibia (average 18.7 ± 0.8‰, n = 8) and the Tepee dolostone, 
Western Canada (single value of 22.4‰). These sections are not 
considered further, owing to their much smaller data sets. 

We have evaluated the extent to which post-depositional processes 
may have overprinted δ13C_carb (ref. 17) and δ13C_org (ref. 9) in these 
sections. Available indicators (petrography, Mn/Sr and δ18O_carb; 
Supplementary Information sections 3 and 6, respectively) and the 
smooth chemostratigraphic trends observed in different successions 
with similar δ13C_carb (Fig. 1) argue against a significant diagenetic 
overprint for the δ13C_carb data. As for δ13C_org, isotope effects associated 
with early diagenesis are minor and taken into account in the Δ_2 
parameter of equation (1) below (Supplementary Information section 
5.1). In addition, molecular organic geochemistry and Rock-Eval 
pyrolysis data for Brazilian cap carbonates (Supplementary Information 
section 4) show that the organic matter experienced low thermal 
maturity and only moderate oxidative weathering, so the δ13C_org signal 
is not significantly affected by these processes (Supplementary 
Information section 5). After screening for post-depositional 
effects, 26 data (out of 186) were eliminated, and a grand mean 
Δ13C_carb-org of 23.2 ± 0.9‰ was calculated for the three cap carbonate 
sequences (Fig. 1). The strong similarity in paired carbon isotope 
values among the three sequences constitutes the most compelling 
argument against a first-order diagenetic control on their low 
Δ13C_carb-org. 

Anomalous Δ13C_carb-org values in Neoproterozoic successions were 
previously interpreted as the result of organic matter input from a large 
dissolved organic carbon (DOC) pool in the ocean. A large DOC 
reservoir can accumulate only in anoxic waters and would buffer the 
δ13C_org, thus decoupling the δ13C_carb and δ13C_org signals. However, 
several authors have recently challenged this hypothesis, on the basis of 
mass balance calculations. Moreover, to account for the consistently 
low Δ13C_carb-org values observed here, cap carbonates would have had 
to form exclusively in DOC-rich anoxic waters. This contradicts geochemical 
and magnetic data showing that the basal haematite-bearing cap 
dolostones were deposited in oxic surface waters. 

We thus consider that the cap carbonates and associated organic 
matter originated from the same surface water, so their Δ13C_carb-org 
can be expressed as 

$$\Delta^{13}\mathrm{C}_{\text{carb-org}} = \varepsilon_p - \Delta_2 + \Delta_{\text{carb}} \qquad (1)$$

where ε_p is the photosynthetic fractionation factor between dissolved 
CO2 and organic matter and depends on [CO2]aq; Δ_2 is the potential 
increase in δ13C_org during early diagenesis, set here at +1.5‰ (see 
Methods); and Δ_carb is the isotopic depletion of dissolved CO2 relative 
to carbonate. Δ_carb is temperature-dependent and can be estimated 
from the carbon isotope fractionation between carbonate species 
(Methods). 

The range of ‘normal’ ε_p values recorded for most of the Phanerozoic 
eon (Fig. 2b) matches the low Δ13C_carb-org of cap carbonates 
(22.3‰ < Δ13C_carb-org < 24.1‰) only for temperatures higher than 
80 °C. However, the activase of the photosynthetic enzyme (Rubisco) 
is ineffective above 45 °C (ref. 20), and only non-phototrophic 
hyperthermophiles can survive in extremely warm environments. 
As temperatures above 80 °C would be at odds with the presence of 
autotroph and heterotroph fossils within and above glacial deposits 
(refs 22, 23; Supplementary Information section 4), the low Δ13C_carb-org 
recorded in cap carbonates instead implies low ε_p values (from 18.8‰ at 
45 °C to 16.9‰ at 25 °C), similar to those of today. 

Figure 2 | Relationship between photosynthetic fractionation factor (ε_p), 
temperature and CO2 concentrations. a, Dissolved CO2 in the ocean 
([CO2]aq) and atmospheric pCO2 versus ε_p for three photosynthesizer growth 
rates (μ, in d−1), obtained from the relationship [CO2]aq = 182 μ(V/S)/(25.3 − ε_p), 
and assuming chemical equilibrium between ocean and atmosphere. In this 
equation (ref. 27), 25.3‰ is the isotope effect associated with carbon fixation, and the 
coefficient 182 is partly dependent on cell membrane permeability. b, ε_p-T 
interval compatible with Δ13C_carb-org values of cap carbonates 
(22.3 < Δ13C_carb-org < 24.1‰). At 45 °C, average Δ13C_carb-org implies an ε_p of 
18.8‰, and hence a [CO2]aq of 55.9 μmol l−1, corresponding to an atmospheric 
pCO2 of 3,200 p.p.m.v. See text and Methods for details. The shaded area in 
b shows the range of ε_p observed during most of the Phanerozoic eon. 

The low ε_p values of cap carbonates are consistent with a low 
[CO2]aq during their deposition, regardless of the speciation of carbon 
(CO2 or HCO3−) and of its uptake mechanism. In some modern 
settings, ε_p is shifted towards lower values, probably because the system 
is significantly driven by the active uptake of HCO3− (refs 25, 26). But 
controlled experiments show that the relative contribution of this 
carbon uptake mechanism decreases significantly as [CO2]aq 
increases. For instance, it did not influence ε_p for most of the 
Phanerozoic, when [CO2]aq was probably much higher than today. 
Therefore, this mechanism could account for the systematically low ε_p 
deduced here only if [CO2]aq during cap carbonate formation were 
low, probably below the present-day range of 10-20 μmol kg−1. In 
the case of dissolved CO2 uptake, ε_p values can be directly related to 
[CO2]aq by empirical equations, which include physiological 
parameters such as the ratio of cellular volume to surface area, (V/S), 
and the growth rate (μ) of photosynthesizers. We used the [CO2]aq-ε_p 
equation from ref. 27 for modern unicellular algal species (Fig. 2a); other 
equations give similar or lower estimates (see Methods). Figure 2a shows 
three curves of [CO2]aq as a function of ε_p for different growth rates and 
for a V/S of 1 μm. Taking a maximum temperature of 45 °C and a 
conservative growth rate of 2 d−1, which is rarely attained even in 
modern high-productivity settings (Supplementary Information section 
1), we estimate an absolute [CO2]aq upper limit of 56 μmol kg−1. 

Theoretically, much higher [CO2]aq values could also produce the 
low ε_p observed in cap carbonates, but only for extremely high cellular 
volume and V/S ratios. We note, however, that even if large organisms 
did flourish just after the glaciation, the increase in cell volume of the 
biomass would probably be counterbalanced by a large decrease in 
growth rates. Growth rate and cell volume are related by a power 
law, μ = aV^b (a being a normalization factor and b a size-scaling 
exponent usually taken as −0.25; ref. 28), which would partly buffer 
the [CO2]aq estimates for higher cell volume and consequently for 
higher V/S ratios (Supplementary Information section 1). 
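
The buffering argument can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' calculation: cells are taken to be spheres (so V/S = r/3 and V scales as r cubed), [CO2]aq at fixed ε_p is taken to scale with the product μ·(V/S) as in the relation used in Fig. 2, and the function name is hypothetical.

```python
def co2_scaling_factor(radius_factor: float, b: float = -0.25,
                       couple_growth: bool = True) -> float:
    """Factor by which the [CO2]aq estimate changes when cell radius is scaled.

    Assumes spherical cells: V/S = r/3 scales linearly with radius and
    V scales with radius cubed. With the power law mu = a * V**b, the growth
    rate then scales as radius_factor ** (3 * b).
    """
    if radius_factor <= 0:
        raise ValueError("radius_factor must be positive")
    vs_factor = radius_factor                       # V/S grows linearly with r
    mu_factor = radius_factor ** (3 * b) if couple_growth else 1.0
    return vs_factor * mu_factor                    # [CO2]aq scales as mu * (V/S)


# Ten-fold larger cells, with and without the growth-rate power law
print(co2_scaling_factor(10.0, couple_growth=False))  # V/S effect alone
print(co2_scaling_factor(10.0))                       # partly buffered by slower growth
```

With b = −0.25 the combined factor scales as the radius to the power 0.25, so under these assumptions a ten-fold increase in cell radius raises the [CO2]aq estimate by less than a factor of two rather than by a factor of ten.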

Assuming that atmosphere and ocean are in chemical equilibrium, 
our [CO2]aq estimates can be converted into an atmospheric pCO2 of 
3,200 p.p.m.v. for the upper estimate (T = 45 °C and μ = 2.0 d−1) and 
~400 p.p.m.v. for a temperature and growth rate similar to present-day 
values (T = 25 °C and μ = 0.5 d−1) (Methods). These low pCO2 
estimates are in apparent contradiction to the extremely high estimates 
of 20,000 to 90,000 p.p.m.v. derived from boron isotope (δ11B; ref. 2) and 
triple oxygen isotope (Δ17O; ref. 3) data. If our estimates are correct, the 
whole data set can nonetheless be reconciled into a new environmental 
picture for the early Ediacaran period. 

Boron isotopes, provided that the δ11B of sea water is known, can be 
used as a proxy for seawater pH, from which atmospheric pCO2 can be 
inferred. Kasemann et al. (ref. 2) have reported a ~4‰ decrease in δ11B in 
the Marinoan cap carbonates of the Maieberg Formation (Namibia), 
which they interpret as resulting from a seawater pH decrease from 8.8 
to <7, in response to a transfer of CO2 from a high-pCO2 atmosphere 
into the surface ocean. As the boron isotopic composition of 
Neoproterozoic oceans remains unknown, however, only variations 
in pH can be obtained from δ11B values, not the absolute seawater pH. 
Therefore, the reported δ11B decrease is also compatible with low 
atmospheric pCO2, assuming a higher seawater pH. This new proposal 
finds additional support from the global presence of shallow-water 
marine carbonate in the glacial aftermath, indicating high levels of 
carbonate saturation, which require elevated seawater pH. 

Triple oxygen isotopes measured on sulphates can also be used to 
constrain atmospheric pCO2 (ref. 3). The acquisition of a negative Δ17O 
signature of sulphate occurs in two steps: acquisition of a negative 
Δ17O by the oxygen in the atmosphere, followed by transfer of the 
anomaly to sulphate during oxidative alteration of pyrite in exposed 
terrestrial sediments. The relationship between pCO2 and Δ17O(O2) is 
given by 

$$\Delta^{17}\mathrm{O}(\mathrm{O_2}) \propto -\,\Delta^{17}\mathrm{O}(\mathrm{CO_{2\,trop}})\,\frac{p_{\mathrm{CO_2}}\,\tau_{\mathrm{O_2}}}{p_{\mathrm{O_2}}} \qquad (2)$$

neglecting multiplicative constants, where CO2 trop is CO2 crossing the 
tropopause from the stratosphere, with Δ17O(CO2 trop) ≈ 1‰ in the 
modern atmosphere; and τ_O2 is the residence time of O2 with respect to 
photosynthesis and respiration (assumed to be in steady state), which 
is ~1,200 years in the modern atmosphere. 

Recently, Bao et al. (ref. 3) reported a negative Δ17O anomaly 
(approximately −0.7‰) in barites intercalated in Marinoan cap carbonates, and 
interpreted it as a proxy for high atmospheric pCO2. Abiotic laboratory 
experiments and experiments with iron-oxidizing organisms have 
shown that 8-16% of sulphate oxygen is incorporated from 
atmospheric O2 (ref. 29), with the remainder from water. Therefore, using 
a 10% value for atmospheric O2 incorporation in sulphate, the observed 
Δ17O(barite) anomaly of −0.7‰ implies a Δ17O(O2) of about −7‰. 
According to equation (2), Δ17O(O2) depends on two quantities: pCO2 
and the ratio pO2/τ_O2, which is the photosynthetic O2 flux. In their 
model, Bao et al. (ref. 3) assumed an atmospheric pO2 of ~20% at the end 
of the Marinoan glaciation, thus resulting in a high pCO2 of 
~10,000 p.p.m.v. (Fig. 3, after ref. 3). We interpret these data otherwise, 
considering lower O2 fluxes (that is, lower pO2 and/or higher τ_O2), in 
agreement with available evidence for low pO2 levels in the early 
Ediacaran (0.2-10%). In such a case, for 1% pO2 and a modern 
τ_O2, a pCO2 as low as ~600 p.p.m.v., in the range of pCO2 values 
predicted here from paired carbon isotopic data, would be sufficient 
to produce the observed Δ17O(barite) of about −0.7‰ (Fig. 3). 

Figure 3 | Relationship between pCO2 (p.p.m.v.) and Δ17O(O2) for 20% and 
1% O2, assuming a modern value for O2 residence time of 1,200 yr. The 
linear relationship between pCO2 and Δ17O(O2) (dashed curves) breaks down 
for large Δ17O(O2) values. The curve for 1% O2 implies a factor of 20 reduction 
in the photosynthetic O2 flux relative to the modern oceanic O2 flux. The 
hatched region shows the range of maximum Δ17O anomalies measured in 
Marinoan barites and carbonate-associated sulphates (Supplementary 
Information section 7). Both 1% and 20% O2 can explain the measured Δ17O, 
but imply greatly different values for pCO2. 
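
The step from the barite anomaly to the atmospheric O2 anomaly is a simple dilution argument, sketched here in Python. The sketch assumes that the water-derived oxygen in sulphate carries no Δ17O anomaly, which is implicit in the passage rather than stated; the function name is hypothetical.

```python
def delta17o_of_o2(delta17o_sulphate: float, o2_fraction: float) -> float:
    """Δ17O of atmospheric O2 (per mil) implied by a sulphate anomaly.

    Assumes a two-source mixture: a fraction `o2_fraction` of sulphate oxygen
    comes from atmospheric O2 and the rest from water with Δ17O = 0.
    """
    if not 0.0 < o2_fraction <= 1.0:
        raise ValueError("o2_fraction must be in (0, 1]")
    return delta17o_sulphate / o2_fraction


barite_anomaly = -0.7  # per mil, value quoted in the text

# The 10% incorporation used in the text, plus the 8-16% experimental range
for fraction in (0.08, 0.10, 0.16):
    value = delta17o_of_o2(barite_anomaly, fraction)
    print(f"{fraction:.0%} from O2 -> Δ17O(O2) ≈ {value:.1f} per mil")
```

The 10% case returns the −7‰ used in the text; the 8-16% range shows how sensitive that figure is to the assumed incorporation fraction.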

In conclusion, the low Δ13C_carb-org values reported here for 
Marinoan cap carbonates not only provide a lower pCO2 estimate than 
previously thought, but also allow the re-interpretation of δ11B and 
Δ17O isotopic data. The new environmental picture deduced from this 
integrated interpretation, with low atmospheric pCO2, high seawater 
pH and low atmospheric pO2 (associated with low O2 fluxes), 
represents a substantial challenge to the hard-end-member snowball Earth 
picture. 


METHODS SUMMARY 


Samples were ground in an agate mortar, then sieved to ensure a grain size of 
<140 μm. For carbonate isotope analyses, we used 100% H3PO4 to extract CO2 
successively from calcite and dolomite, in a two-step dissolution process (4 h at 
25 °C for calcite, then 2 h at 80 °C for dolomite). We measured carbon and oxygen 
isotope compositions of the evolved CO2 using a gas chromatograph coupled to a 
GV Instruments Analytical Precision 2003 mass spectrometer. The external 
reproducibilities (1σ) for δ13C_carb and δ18O_carb measurements are 0.1‰ and 0.2‰, 
respectively. For C and N quantification, and organic carbon analysis, samples 
were decarbonated in 6 N HCl overnight at room temperature, followed by 2 h at 
80 °C. Residues were washed with distilled water, centrifuged and dried at 50 °C. 
Samples of decarbonated powder (10-60 mg) were loaded into quartz tubes along 
with copper oxide wires. The tubes were connected to a vacuum line and sealed 
under secondary vacuum (<10 > mbar), then heated at 950 °C for 6 h. The 
resulting CO2 and N2 were purified on a vacuum line and manometrically quantified 
using a Toepler pump. Total organic carbon content, nitrogen content and hence 
C/N are deduced from the CO2 and N2 quantification with a precision of ±10% 
relative to the measured value. The carbon isotope composition was measured 
using a dual-inlet Thermo Finnigan Delta+XP mass spectrometer. The 
reproducibility of the δ13C_org measurement is ±0.1‰ (1σ). All isotopic results are given 
in δ notation calibrated to V-PDB (Vienna Pee Dee Belemnite). Rock-Eval 
analyses and organic geochemistry were performed using standard methods, 
described in the online Methods section. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


pCO2 reconstruction. As explained in the main text, ε_p is calculated from 
Δ13C_carb-org knowing Δ_carb and Δ_2 (fixed at 1.5‰; Supplementary Information 
section 5.1). Δ_carb can be decomposed as the sum of the isotope fractionation 
factor between carbonate and HCO3− (Δ13C_carb-HCO3−) and the one between 
HCO3− and CO2aq (Δ13C_HCO3−-CO2aq). Three sources of Δ13C_carb-HCO3− are available: 
the Δ13C_CaCO3-HCO3− for inorganic calcite precipitation of 0.9‰ at ambient 
temperature; the Δ13C_dolomite-HCO3−, which ranges from 3.3‰ to 1.2‰ at ambient 
temperature; and the Δ13C_calcite-HCO3− of −1.2‰, estimated in ref. 9 from the 
difference between modern carbonate sediments and surface seawater HCO3−. We 
chose this smallest value of Δ13C_CaCO3-HCO3− to ensure that the ultimately derived 
pCO2 estimate represents an upper limit. Since Δ13C_CaCO3-HCO3− may change 
slightly with temperature, we used the variation of 0.01‰ °C−1 also proposed in 
ref. 9. In contrast, Δ13C_HCO3−-CO2aq is strongly temperature-dependent and is 
calculated using the formula Δ13C_HCO3−-CO2aq = 9866/(T + 273) − 24.12 (ref. 34). 
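
The chain from Δ13C_carb-org to ε_p described in this paragraph can be sketched in Python as follows. This is an illustration, not the authors' code. Two choices are assumptions made here: the 0.01‰ °C−1 correction is applied as an increase relative to a 25 °C reference temperature, and T in the bicarbonate-CO2 formula is read as degrees Celsius. All function and parameter names are hypothetical. With these choices the sketch comes close to the ε_p values quoted in the main text (18.8‰ at 45 °C and 16.9‰ at 25 °C) when fed the grand mean Δ13C_carb-org of 23.2‰.

```python
def delta_hco3_co2aq(temp_c: float) -> float:
    """Fractionation between HCO3- and dissolved CO2 (per mil), T in deg C."""
    return 9866.0 / (temp_c + 273.0) - 24.12


def delta_carb(temp_c: float, carb_hco3_ref: float = -1.2,
               slope_per_c: float = 0.01, ref_temp_c: float = 25.0) -> float:
    """Depletion of dissolved CO2 relative to carbonate (per mil).

    Sum of the carbonate-HCO3- term (-1.2 per mil, with an ASSUMED linear
    temperature correction of +0.01 per mil per deg C around 25 deg C) and
    the HCO3- to CO2(aq) term.
    """
    carb_hco3 = carb_hco3_ref + slope_per_c * (temp_c - ref_temp_c)
    return carb_hco3 + delta_hco3_co2aq(temp_c)


def epsilon_p(big_delta: float, temp_c: float, delta_2: float = 1.5) -> float:
    """Invert equation (1): eps_p = Δ13C_carb-org + Δ_2 - Δ_carb."""
    return big_delta + delta_2 - delta_carb(temp_c)


grand_mean = 23.2  # per mil, grand mean Δ13C_carb-org from the main text
for temp in (25.0, 45.0):
    print(f"T = {temp:.0f} °C -> eps_p ≈ {epsilon_p(grand_mean, temp):.1f} per mil")
```

Because a smaller carbonate-HCO3− term gives a smaller Δ_carb and therefore a larger ε_p, choosing −1.2‰ pushes the final pCO2 estimate upward, which is why the text describes it as an upper limit.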

[CO2]aq is calculated from ε_p using the relation obtained by Popp et al. (ref. 27) from 
cultures of unicellular algae (Fig. 2). Other [CO2]aq-ε_p equations have been 
determined but most of them were obtained for single species at a local scale (in lakes) 
and do not take into account variations in physiological parameters. Equations 
based on mixed phytoplankton populations, which take into account V/S and 
μ, yield [CO2]aq estimates lower than those obtained using Popp et al.’s equation. 
Hence, the choice of this equation provides an upper estimate of [CO2]aq. 
Moreover, the equation used here is more compatible with the algal signature of 
molecular organic data (Supplementary Information section 4). 
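
A short Python sketch of this step follows. It uses the relation quoted in the Fig. 2 caption, read here as [CO2]aq = 182 μ (V/S)/(25.3 − ε_p); the minus sign in the denominator is a reconstruction, checked against the 56 μmol kg−1 upper limit quoted in the main text. The function name and argument names are hypothetical.

```python
def co2_aq(eps_p: float, growth_rate_per_day: float,
           v_over_s_um: float = 1.0,
           eps_fixation: float = 25.3, coeff: float = 182.0) -> float:
    """Dissolved CO2 (μmol kg-1) from the photosynthetic fractionation eps_p.

    eps_fixation is the isotope effect of carbon fixation (per mil) and coeff
    is the empirical coefficient, both as quoted in the Fig. 2 caption.
    """
    denominator = eps_fixation - eps_p
    if denominator <= 0:
        raise ValueError("eps_p must be smaller than the fixation isotope effect")
    return coeff * growth_rate_per_day * v_over_s_um / denominator


# Upper estimate: eps_p = 18.8 per mil (45 °C), growth rate 2 per day, V/S = 1 μm
print(round(co2_aq(18.8, 2.0), 1))
# Present-day-like case: eps_p = 16.9 per mil (25 °C), growth rate 0.5 per day
print(round(co2_aq(16.9, 0.5), 1))
```

The first call reproduces the upper limit quoted in the main text, and the second falls within the present-day range of 10-20 μmol kg−1 that the text cites.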

Finally, [CO2]aq is converted into pCO2, assuming that ocean and atmosphere 
were at equilibrium as they are today. Although episodic events of high growth 
rate can drive the ocean-atmosphere system out of equilibrium owing to the 
rapid drawdown of ocean CO2 by photosynthesizers, they are typically local 
and occur on short timescales. The ubiquitous and homogeneous isotopic record 
of cap carbonates points instead to a global and persistent process. We used 
Henry’s law: pCO2 = [CO2]aq/k_0. The Henry constant (k_0) is expressed as a 
function of temperature (T) and salinity (S) as follows: ln(k_0) = 9345.17/T 
− 60.2409 + 23.3585 ln(T/100) + S[0.023517 − 0.00023656T + 0.0047036(T/100)^2]. 

We have tested salinity values from 15 to 50 uM (modern value being 35 1M). In 
the main text, we present only the value corresponding to the highest pco, estimate 
obtained with a high salinity of 50 uM. 
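
The Henry's law conversion can be sketched in Python as below. The sketch makes three assumptions that the passage does not state: T in the k_0 expression is in kelvin, k_0 comes out in mol kg−1 atm−1, and a partial pressure in μatm is read directly as p.p.m.v. (total pressure of 1 atm). Salinity is passed as a bare number in whatever units the k_0 fit expects. Function names are hypothetical.

```python
import math


def henry_k0(temp_k: float, salinity: float) -> float:
    """CO2 solubility constant from the expression quoted in the Methods.

    ASSUMED units: temp_k in kelvin, result in mol kg-1 atm-1.
    """
    t100 = temp_k / 100.0
    ln_k0 = (9345.17 / temp_k - 60.2409 + 23.3585 * math.log(t100)
             + salinity * (0.023517 - 0.00023656 * temp_k + 0.0047036 * t100 ** 2))
    return math.exp(ln_k0)


def pco2_ppmv(co2_aq_umol_kg: float, temp_c: float, salinity: float) -> float:
    """Henry's law: pCO2 = [CO2]aq / k0, with μatm read as p.p.m.v. at 1 atm."""
    k0 = henry_k0(temp_c + 273.15, salinity)
    return co2_aq_umol_kg / k0  # (μmol kg-1) / (mol kg-1 atm-1) = μatm


# Upper estimate from the text: 56 μmol kg-1 at 45 °C, high-salinity case (50)
print(round(pco2_ppmv(56.0, 45.0, 50.0)))
# Present-day-like case: about 10.8 μmol kg-1 at 25 °C, same salinity
print(round(pco2_ppmv(10.8, 25.0, 50.0)))
```

Under these assumptions the two cases land close to the 3,200 and ~400 p.p.m.v. figures given in the main text; lowering the salinity raises k_0 and so lowers the pCO2 estimate, consistent with the high-salinity case being the upper bound.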

Carbon isotope analysis, C and N quantification. Samples were ground in an 
agate mortar, then sieved to ensure a grain size of <140 μm. Powdered samples 
were reacted with 100% H3PO4 at 25 °C for 4 h to extract the CO2 from calcite, and 
then at 80 °C for 2 h to extract CO2 from dolomite. Carbon and oxygen isotope 
compositions were measured using a helium continuous flow mass spectrometer 
(AP 2003). Isotopic compositions are given in the δ notation relative to the V-PDB 
(Vienna Pee Dee Belemnite). The external reproducibilities (1σ) for δ13C_carb and 
δ18O_carb measurements are 0.1‰ and 0.2‰, respectively. δ13C_carb values measured 
on calcite and dolomite were similar, except in the lower part of the dolomite 
where calcite δ13C_carb values were lower by a maximum of 2‰. Because 
petrographic observations of these dolostones attest to the secondary character 
of calcite (Supplementary Information section 3), only the dolomite δ13C_carb 
values were considered. For C and N quantification, and organic carbon analysis, 
samples were decarbonated in 6 N HCl overnight at room temperature, followed 
by 2 h at 80 °C. Residues were washed with distilled water, centrifuged and dried at 
50 °C. Samples of decarbonated powder (10-60 mg) were loaded into quartz tubes 
along with copper oxide wires. The tubes were connected to a vacuum line and 
sealed under secondary vacuum (<10 ° mbar), then heated at 950 °C for 6 h. The 
resulting CO2 and N2 were purified on a vacuum line and manometrically 
quantified using a Toepler pump. Total organic carbon content, nitrogen content and 
hence C/N are deduced from the CO2 and N2 quantification with a precision of 
±10% relative to the measured value. The carbon isotope composition was 
measured using a dual-inlet Thermo Finnigan Delta+XP mass spectrometer at the 
IPGP, and is expressed in δ notation calibrated to V-PDB (Vienna Pee Dee 
Belemnite). The reproducibility of the δ13C_org measurement is ±0.1‰ (1σ). 
Molecular organic geochemistry. Analyses were performed at Henri-Poincaré 
University of Nancy. The soluble organic matter was extracted with 
dichloromethane at 100 bar and 80 °C using an Accelerated Solvent Extractor ASE 200. 
A blank was performed before each extraction. Two extraction cycles were 
performed to ensure a complete extraction. Elemental S was removed by introducing 
HCl-activated Cu chips in vials containing the solvent and the extract. 
Dichloromethane was evaporated using a Zymark TurboVap LV. The extracted 
organic matter was fractionated into aliphatic and aromatic hydrocarbons on a 
silica column by successive elution of pentane and pentane/dichloromethane 
(65/35). Aliphatic hydrocarbons were diluted in hexane (4 mg ml−1) and analysed on a 
HP5890 Series II gas chromatograph coupled with a HP5971 mass spectrometer 
(GC-MS) following the procedure described in ref. 23. 

Rock-Eval analyses. Because a TOC content of 0.3% is required for a meaningful 
determination of Rock-Eval parameters, the bulk-rock organic matter was first 
concentrated by HF and HCl mineral dissolution using a kerogenatron. Analyses 
were carried out on the organic concentrate using a Rock-Eval 6 Turbo device at 
the Institut Français du Pétrole following the classical methodology. The Rock-Eval 
parameters used were the hydrogen index (HI, mg HC per g TOC) and 
oxygen index (OI, mg CO₂ per g TOC) also described in ref. 39. 
Mirror extreme BMI phenotypes associated with 
gene dosage at the chromosome 16p11.2 locus 


A list of authors and their affiliations appears at the end of the paper 


Both obesity and being underweight have been associated with 
increased mortality'*. Underweight, defined as a body mass index 
(BMI) ≤ 18.5 kg per m² in adults and ≤ −2 standard deviations 
from the mean in children, is the main sign of a series of hetero- 
geneous clinical conditions including failure to thrive’, feeding 
and eating disorder and/or anorexia nervosa®’. In contrast to 
obesity, few genetic variants underlying these clinical conditions 
have been reported*”. We previously showed that hemizygosity of a 
~600-kilobase (kb) region on the short arm of chromosome 16 
causes a highly penetrant form of obesity that is often associated 
with hyperphagia and intellectual disabilities’’. Here we show that 
the corresponding reciprocal duplication is associated with being 
underweight. We identified 138 duplication carriers (including 
132 novel cases and 108 unrelated carriers) from individuals 
clinically referred for developmental or intellectual disabilities 
(DD/ID) or psychiatric disorders, or recruited from population- 
based cohorts. These carriers show significantly reduced postnatal 
weight and BMI. Half of the boys younger than five years are 
underweight with a probable diagnosis of failure to thrive, whereas 
adult duplication carriers have an 8.3-fold increased risk of being 
clinically underweight. We observe a trend towards increased 
severity in males, as well as a depletion of male carriers among 
non-medically ascertained cases. These features are associated with 
an unusually high frequency of selective and restrictive eating 
behaviours and a significant reduction in head circumference. 
Each of the observed phenotypes is the converse of one reported 
in carriers of deletions at this locus. The phenotypes correlate with 
changes in transcript levels for genes mapping within the duplica- 
tion but not in flanking regions. The reciprocal impact of these 
16p11.2 copy-number variants indicates that severe obesity and 
being underweight could have mirror aetiologies, possibly through 
contrasting effects on energy balance. 

Copy-number variants (CNVs) at the 16p11.2 locus have been asso- 
ciated with cognitive disorders including autism (deletions) and schizo- 
phrenia (duplications)'*”, conditions that have been suggested to lie at 
opposite ends of a single spectrum of psychiatric phenotypes’*. We and 
others have reported that a deletion of this region spanning 28 genes 
(Supplementary Table 1) increases the risk of morbid obesity 43-fold 
(Supplementary Fig. 1)'°. We hypothesized that the reciprocal 
duplication, with its resulting increase in gene dosage, may influence 
BMI in a converse manner. The duplication was identified in 73 out of 
31,424 patients with DD/ID, a frequency consistent with previous 
reports’’ (Table 1). Four additional cases were identified among 1,080 
patients affected by bipolar disease or schizophrenia. Compared to its 
prevalence in seven European population-based genome-wide asso- 
ciation study (GWAS) cohorts'*"* (31 out of 58,635 individuals), the 
duplication was significantly more frequent in both the DD/ID cohorts 
(P = 4.23 X 10 '°; odds ratio = 4.4, 95% confidence interval = 2.9- 
6.9) and the psychiatric cohorts (P = 3.6 × 10⁻³; odds ratio = 7.0, 
95% confidence interval = 1.8-19.9) (Table 1), strengthening previous 
reports of similar associations'*’’. Our data do not support a two-hit 
model’? for the effects of 16p11.2 duplications or deletions (Supplemen- 
tary Text and Supplementary Table 2). 
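
The odds ratios quoted above can be recomputed from the carrier counts in Table 1. The following is a minimal sketch rather than the procedure used in the study: the function name `odds_ratio_wald` is hypothetical, and the interval is a Wald (log-normal) approximation, so it may differ slightly from the published confidence intervals.

```python
import math


def odds_ratio_wald(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Odds ratio with an approximate 95% Wald interval from a 2x2 table.

    Assumption: a Wald interval on the log odds ratio; the published
    intervals may have been derived with a different (exact) method.
    """
    a = case_carriers                    # carriers among cases
    b = case_total - case_carriers       # non-carriers among cases
    c = control_carriers                 # carriers among controls
    d = control_total - control_carriers # non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from Table 1: DD/ID cohorts versus population-based cohorts.
or_dd, lo_dd, hi_dd = odds_ratio_wald(73, 31_424, 31, 58_635)
print(f"DD/ID: OR = {or_dd:.1f} (approx. 95% CI {lo_dd:.1f}-{hi_dd:.1f})")

# Counts from Table 1: adult psychiatric cohorts versus population-based cohorts.
or_psy, lo_psy, hi_psy = odds_ratio_wald(4, 1_080, 31, 58_635)
print(f"Psychiatric: OR = {or_psy:.1f} (approx. 95% CI {lo_psy:.1f}-{hi_psy:.1f})")
```

With these counts the point estimates round to 4.4 and 7.0, in line with the values given in the text.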


We compared available data on weight, height and BMI for 106 
independent duplication carriers (including published cases) to data 
for reference populations matched for gender, age and geographical 
location (Table 2, Methods and Supplementary Tables 3 and 4). The 
duplication was strongly associated with lower weight (mean Z-score 
−0.56; P = 4.4 × 10⁻⁴) and lower BMI (mean Z-score −0.47; 
P = 2.0 × 10⁻³) (Table 2 and Supplementary Table 5). Birth parameters 
(n = 48) were normal, indicating a postnatal effect. Adults 
carrying the duplication had a relative risk of being clinically under- 
weight (BMI <18.5) of 8.3 (95% confidence interval = 4.4-15.9, 
P=1.53 X10 '°) (see Methods). Concordantly, none of the 3,544 
patients in our obesity cohorts'®!” carried the duplication (Table 1). 

To investigate these associations further, we carried out separate 
analyses of carrier patients (DD/ID and psychiatric) and non-medically 
ascertained carriers (population-based cohorts, plus 11 transmitting 
parents and three other affected first-degree relatives for whom data 
were available) (Table 2). Each category had significantly lower weight 
and BMI, with similar effect sizes. However, the proportion of underweight 
cases (BMI ≤ −2 s.d.) was higher in the first group than in the 
second group (17 out of 76 compared to 2 out of 40; P = 0.017). Note 
that the impact of the duplication on underweight status might be 
underestimated here owing to prescription of antipsychotic treatments 
that are often associated with weight gain’? (Supplementary Table 6). 

Having demonstrated an association of the duplication with being 
underweight, we investigated the implications of gender for the resulting 
phenotypes (Fig. 1, Supplementary Fig. 2 and Supplementary Table 7). 
In DD/ID patients, the impact of the duplication on being underweight 
is stronger in males; the effect in females is in the same direction, but is 
smaller and not statistically significant (Table 2). A similar and signifi- 
cant difference (P = 0.0168) was observed in adult carriers (all groups 
combined): the relative risk of being underweight for males is 23.2 
(95% confidence interval = 9.1-59.3, P= 4.6 X 10 *'); for females it 
is only 4.7 (95% confidence interval = 1.9-11.8, P=9.9 X 10“). A 
gender bias was also observed in the ascertainment of DD/ID duplica- 
tion carriers, in which we have an excess of males (51 males:33 females, 
P = 0.044). By contrast, carriers from the general population showed a 
strong overrepresentation of females (10 males:21 females, P = 0.035) 
(Supplementary Text). A similar bias was observed among transmit- 
ting parents (7 males:23 females, P = 5.53 x 10“). Thus, there is an 
overrepresentation of males in the medically ascertained group, and a 
depletion in the non-medically ascertained one. We suggest that males 
may be more likely than females to present severe phenotypes, and that 
this may account for the gender bias because severely affected males 
may be less likely to be recruited to adult population cohorts or to be 
reproductively successful. 

As previously reported”’, the duplication was also associated with 
reduced head circumference (mean Z-score —0.89; P= 7.8 X 10° °) 
(Fig. 1), 26.7% presenting with microcephaly (head circumference = 
—2 s.d.), whereas carriers of the reciprocal deletion had an increased 
head circumference (mean Z-score +0.57; P= 1.79 X 10 °) (Sup- 
plementary Fig. 3 and Supplementary Table 8): an additional instance 
of a mirror phenotype associated with reciprocal changes in copy 
number at this locus. Notably, head circumference Z-scores correlate 
Table 1 | 16p11.2 rearrangements in cases and controls 


Ascertainment Cohorts Duplication (n, P†) Deletion (n, P†) Total 

Neurodevelopmental disorders 
Unspecified DD/ID* from 28 cytogenetic centres 72 113 30,323 
ADHD‡, deCODE 0 1 591 
Childhood autism‡, deCODE 0 2 159 
Childhood autism spectrum disorder‡, deCODE 1 3 351 
TOTAL 73 4.23 x 10-78 119 543x103? 31,424 
Rearrangement frequency (95% CI) 0.23% (0.18-0.29) 0.38% (0.31-0.45) 

Family history First-degree relatives of probands 30 35 43/62 || 

Adult psychiatric symptoms 
Schizophrenia, deCODE 0 1 657 
Bipolar disease, Rouen 1 0 56 
Schizophrenia, schizo-affective, Rouen 3 0 267 
TOTAL 4 3.57 x 10-3 1 3.78 x 107-7 1,080 
Rearrangement frequency (95% CI) 0.37% (0.01-0.73) 0.09% (0-0.27) 

Underweight 
Eating disorder, Spain 1§ 0 441 

Obesity 
Obesity, Spain 0 2 653 
Adult obesity, France 0 4 705 
Childhood obesity, France & UK 0 7 1,574 
Obesity bariatric surgery, France 0 2 41 
Obesity discordant siblings, Sweden 0 2 159 
Obesity and cognitive delay, France & UK 0 9 312 
TOTAL 0 4.21x 107 26 2.52 x 10-79 3,544 
Rearrangement frequency (95% CI) 0 0.73% (0.45-1.01) 

Population-based cohorts 
NFBC1966 Finnish 4 3 5,319 
CoLaus Swiss 5 0 5,612 
EGCUT Estonian 2 1 2,994 
deCODE Iceland 17 18 36,601 
SHIP Germany 1 2 4,070 
KORA F3+F4 Germany 2 1 3,458 
Paediatric family study 0 0 581 
TOTAL 31 25 58,635 
Rearrangement frequency (95% CI) 0.05% (0.03-0.07) 0.04% (0.03-0.06) 


CI, confidence interval; ADHD, attention-deficit hyperactivity disorder. *Not a disease-specific cohort. Detailed distribution is provided in the online methods. †Fisher’s exact test, compared to the combined 
frequency in general population groups. ‡There was no overlap between these 3 cohorts. §Atypical duplication (see Supplementary Fig. 5). || Total number of parental pairs tested for duplication/deletion. 13 out of 
43 duplications and 27 out of 62 deletion cases were de novo. 


positively with those of BMI in carriers of both the duplication 
(rho = 0.37; P=2.65X10 °) and the deletion (rho = 0.42; 
P=19X10 °) (Supplementary Methods). This indicates that head 
circumference and BMI may be regulated by a common pathway, or 
that a causal relationship exists between these two traits in these 
patients. Alternatively, the two phenotypes may arise from distinct 
genes and pathways. A full list of malformations and secondary phe- 
notypes reported in duplication carriers ascertained for DD/ID is 
available in Supplementary Table 9. 

In view of the importance of modified eating behaviours in obesity 
and being underweight, the clinical reports of duplication carriers were 
screened for evidence of such modified behaviours. In 11 out of 77 
clinically ascertained cases, clinicians had spontaneously reported 
low food intake and selective and restrictive eating behaviour, again 
mirroring one of the phenotypes—hyperphagia—seen in deletion 
carriers'° (Supplementary Table 6) and indicating that the duplication 
may increase the risk of eating disorders. Consequently, we carried out 


multiplex ligation-dependent probe amplification (MLPA, Supplemen- 
tary Table 10) to screen for 16p11.2 rearrangements in 441 patients 
diagnosed with eating disorders, including anorexia nervosa, bulimia and 
binge eating disorder (Table 1 and Supplementary Text). No duplications 
of the entire region were identified, but one out of 109 anorexia nervosa 
patients carried an atypical 136-kb duplication that encompasses the 
sialophorin (SPN) and quinolinate phosphoribosyltransferase (QPRT) 
genes (Supplementary Fig. 4). This single, smaller duplication does not 
allow us to draw any firm conclusions, but together with other atypical 
rearrangements, it may, in the future, be essential for establishing the 
roles of the 28 genes within the region. 

Large genomic structural variants are known to affect the expression 
of genes not only within the affected region but also at a distance”. 
Therefore, it is possible that the phenotypes observed in 16p11.2 dele- 
tion and duplication individuals are due to effects on the expression of 
genes mapping outside the rearranged interval, rather than to gene 
dosage within the 600-kb deletion or duplication. We measured 


Table 2 | Comparisons of the height, weight and BMI distributions in duplication carriers and controls. 


Combined† DD/ID or psychiatric† Non-medically ascertained‡ 
Strata Mean Z-score P-value n* Mean Z-score P-value n* Mean Z-score P-value n* 
BMI All -0.47 2.0 x 10-3 102 —0.56 4.1x 10-3 76 -0.45 6.0 x 10-3 40 
Male —0.54 2.1 x 10-2 52 —0.71 1.3 x 107-7 43 —0.31 2.0 x1071 14 
Female -0.4 1.8 x 10°? 50 —0.37 83 x10? 33 -0.52 4.2 x 10-3 26 
Weight All —0.56 4.4 x 10-4 104 —0.65 1.3 x 10-3 78 —0.61 3.0x 10-3 40 
Male —0.64 5.8 x 1073 53 -0.79 4.4x 10-3 44 -0.57 88x10? 14 
Female -0.47 1.7 x 10°? 51 -0.47 6.5 x 10°? 34 —0.63 8.6 x 10-3 26 
Height All —0.24 4.8 x 10-2 103 —0.33 3.6 x 10-7 77 —0.15 1.8x107! 40 
Male -0.34 4.5 x10? 52 -0.4 4.6 x10? 43 —0.29 1.21071 14 
Female —0.14 2.6 x1071 51 —0.24 2.1.<10°* 34 —0.07 3.7 x10} 26 
The available BMI, weight and height data for duplication carriers were transformed to Z-scores using gender- and age-matched reference populations, and one-tailed t-tests were carried out to determine whether 
the mean Z-scores deviated from zero. Significant differences were identified by reference to cutoffs controlling the false discovery rate at 5% (see Methods): BMI, 0.022; weight, 0.032; height, 0.025. Significant 
results are indicated in bold. Data were not available for all subjects. *Relatives of probands were excluded as required, to avoid including more than one member of the same family in a single analysis. †Including 
24 cases from the literature (Supplementary Table 3). ‡Population-based cases and first-degree relatives of probands. 
Figure 1 | Effect of the chromosome 16p11.2 duplication on BMI and head 
circumference. Z-score values of BMI and head circumference in carriers of 
the 16p11.2 duplication, stratified by gender and age group. The most severe 
effect is observed in children at 0-5 years of age. Boxplots represent the fifth, 
twenty-fifth, median, seventy-fifth and ninety-fifth percentile for each age 
group. Light grey and dark grey backgrounds represent ≤ −2 and ≤ −3 s.d., 
respectively, corresponding to the WHO definition of moderately and severely 
underweight”. BMI is decreased in adolescent and adult females. 


relative transcript levels of 27 genes mapping within or near to the 
rearrangement, using lymphoblastoid cell lines (Supplementary 
Tables 1 and 11): six from deletion carriers, five from duplication 
carriers and ten from gender- and age-matched controls (Supplemen- 
tary Table 12). Expression levels correlated positively with gene dosage 
for all genes in the CNV region (Fig. 2), consistent with published 
partial results from adipose tissue’®. Mean relative transcript levels 
in deletion and duplication carriers were, respectively, 67% and 
214% of the levels measured in controls (Supplementary Table 13). 
Although genes proximal (centromeric) to the rearrangement interval 
showed no significant variation in relative transcript levels between 
patients and controls (Fig. 2), distal (telomeric) genes showed a 
marked alteration in relative expression. However, their expression 
levels, including that of SH2B1 (for which gene dosage and a nearby 
single nucleotide polymorphism (SNP) have been associated with 
obesity’**°), were similarly upregulated in cell lines of both deletion 
and duplication carriers, showing no apparent correlation between 
transcript level and either copy number or phenotype (Fig. 2). 
Although lymphoblastoid cells may not recapitulate obesity-relevant 
tissues, previous experiments have shown a high degree of correlation 
between expression levels in different tissues and cell lines”, indicating 
that the same pathways may be similarly disrupted in different cell 
lineages. Thus, any involvement of these distal genes in the control of 
BMI in these subjects seems unlikely. 

Our study demonstrates the power of very large screens (>95,000 
samples: to our knowledge the largest of its kind so far) to characterize 
the clinical and molecular correlates of a rare functional genomic vari- 
ant. We demonstrate unambiguously that carrying the 16p11.2 duplica- 
tion confers a high risk of being clinically underweight, and show that 
reciprocal changes in gene dosage at this locus result in several mirror 
phenotypes. As in the schizophrenia/autism'* and microcephaly/ 
macrocephaly”' dualisms, abnormal eating behaviours, such as hyper- 
phagia and anorexia, could represent opposite pathological manifesta- 
tions of a common energy-balance mechanism, although the precise 
relationships between these mirror phenotypes remain to be deter- 
mined. We speculate that head circumference (which correlates with 
brain volume”’), and thus neuronal circuitry, may affect cognitive func- 
tion and energy balance in patients with 16p11.2 rearrangements 
(possibly through eating behaviour). Consistent with this are previous 
reports that a subgroup of children with microcephaly show a con- 
comitant reduction in weight percentile’. Our findings also support 
the observation that severe overweight and underweight phenotypes 
correlate with lower cognitive functioning*’’. Thus, abnormal food 
intake may be a direct result of particular neurodevelopmental disorders. 
Although it is possible that the 16p11.2 region encodes distinct 
genes specific for each trait, a more parsimonious hypothesis is that 
these clinical manifestations of dysfunction of the central nervous sys- 
tem are all secondary to the disruption of a single neurodevelopmental 
step that is sensitive to gene dosage. Further resolution of this issue may 
Figure 2 | Transcript levels for genes within and near to the 16p11.2 
rearrangements. a, Relative expression levels of 27 genes mapping to 16p11.2 
in deletion (n = 6) and duplication (n = 5) carriers (red and green, 
respectively), and in control cell lines (n = 10, blue). Grey lines denote the 
extent of the 16p11.2 CNV (29.5-30.1 megabases (Mb)). Complete lists of 
genes mapping within the rearranged interval, and of the quantitative PCR 
assays, are in Supplementary Tables 1 and 11, respectively. For the possible 
relevance of each of these genes to obesity/leanness and/or developmental 
delay/cognitive deficits, see ref. 10. b, Rank comparison (Kruskal-Wallis test) 


between the expression of 27 genes mapping to 16p11.2 in deletion and 
duplication carriers (red and green, respectively) and in control cell lines (blue). 
Genes are labelled as telomeric, centromeric or within the rearranged interval 
(CNV). Dots correspond to the mean group rank and bars indicate the 
comparison interval. Groups with non-overlapping intervals are significantly 
different (P-values were adjusted for multiple testing issues using a Bonferroni 
correction, where the number of tests is the number of pairwise comparisons; 
the resulting adjusted P-value was less than 0.05). 


6 OCTOBER 2011 | VOL 478 | NATURE | 99 


©2011 Macmillan Publishers Limited. All rights reserved 


LETTER 


require the identification of additional patients with rare atypical re- 
arrangements in this region. 


METHODS SUMMARY 


Underweight is defined in adults as BMI ≤ 18.5. In individuals younger than 
18 years of age, it is defined as a Z-score ≤ −2. 

Statistics. Two-tailed Fisher’s exact test was used to compare frequencies of the 
rearrangement in patients and controls. Z-scores were computed for all data using 
gender-, age- and geographically-matched reference populations. One-tailed 
Student's t-test was performed to test BMI, height, weight and head circumference 
in duplication carriers for Z-scores of less than zero. We used Kruskal-Wallis tests 
for differences in gene expression patterns. P-values were adjusted using a 
Bonferroni correction, considering the number of pairwise comparisons; the result- 
ing adjusted P-value was less than 0.05. The relative risk of being underweight was 
calculated as the ratio of the fraction of underweight individuals among duplication 
carriers versus our control group. 
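
The relative-risk definition given above can be written out as a short sketch. The function name and the counts in the usage lines are hypothetical placeholders chosen only to illustrate the arithmetic; they are not data from this study.

```python
def relative_risk(carrier_underweight, carrier_total, control_underweight, control_total):
    """Ratio of the underweight fraction in carriers to that in controls."""
    carrier_fraction = carrier_underweight / carrier_total
    control_fraction = control_underweight / control_total
    return carrier_fraction / control_fraction


# Hypothetical counts, for illustration only (not taken from the paper):
# 10 of 100 carriers and 5 of 500 controls are underweight.
rr = relative_risk(10, 100, 5, 500)
print(f"relative risk = {rr:.1f}")
```

A ratio above 1 indicates that underweight is more frequent among carriers than in the control group.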

Discovery of CNVs. Carriers of 16p11.2 duplication and deletion were identified 
through various procedures: (1) comparative genomic hybridization with Agilent 
44K, 60K, 105K, 180K, 244K arrays; (2) Illumina Human317, Human370, 
HumanHap550, Human610 and 1M BeadChips; (3) Affymetrix 6.0, 500K geno- 
typing arrays; (4) quantitative multiplex PCR of short fluorescent fragments 
(QMPSF); (5) fluorescent in situ hybridization (FISH); (6) MLPA. CNV analyses 
of GWAS data were carried out using cnvHap, a moving-window average-intensity 
procedure, a Gaussian mixture model, circular binary segmentation, QuantiSNP, 
PennCNV, BeadStudio GT module and Birdseed. At least two independent algo- 
rithms were used for each cohort. 

Expression analyses. Lymphoblastoid cell lines were established from carriers and 
controls. SYBR Green quantitative PCR was performed to assess relative expres- 
sion of genes. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Study cohorts. For the description of these cohorts, refer to Supplementary 
Information. 

CNV detection. Cases ascertained for intellectual disabilities and developmental 
delay were identified through standard medical diagnostic procedures. CNV ana- 
lyses of GWAS data were variously carried out using cnvHap”*'; a moving-window 
average-intensity procedure; a Gaussian mixture model (Valsesia et al., submitted); 
circular binary segmentation’; QuantiSNP**; PennCNV~; BeadStudio GT module 
(Illumina Inc.); and Birdseed*® (see below). At least two independent algorithms were 
used for each cohort. 

Patients referred for intellectual disabilities and developmental delay. All 
diagnostic procedures (CGH, quantitative PCR and/or quantitative multiplex 
PCR of short fluorescent fragments) were carried out according to the relevant 
guidelines of good clinical laboratory practice for the respective countries. All 
rearrangements in probands were confirmed by a second independent method 
and karyotyping was performed in all cases to exclude a complex rearrangement. 
Northern Finland 1966 birth cohort (NFBC). CNV calling has been previously 
described'®. In brief, data were normalized using Illumina BeadStudio, then GC 
effects on ratios were removed by regressing on GC and GC², and wave effects 
were removed by fitting a Loess function*’. CNV analysis was done using 
cnvHap”'. All called 16p11.2 duplications were validated by direct analysis of 
log₂ ratios. Data for each probe were normalized by first subtracting the median 
value across all samples (so that the distribution of ratios for each probe was 
centred on zero), and then dividing by the variance across all samples (to correct 
for variation in the sensitivity of different probes to copy-number variation). All 
CNV calls were confirmed by MLPA. 
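
The per-probe normalization described above (centre on the median across samples, then scale by the variance across samples) can be sketched as follows. This is an illustration under assumptions, not the pipeline used for the cohort: the function name is hypothetical, and the text does not say whether the sample or the population variance was used, so the sample variance is assumed here.

```python
from statistics import median, variance


def normalize_probe(log2_ratios):
    """Normalize one probe's log2 ratios across all samples.

    Step 1: subtract the across-sample median so the ratios centre on zero.
    Step 2: divide by the across-sample variance (assumption: sample variance).
    Requires at least two samples and non-zero variance.
    """
    centre = median(log2_ratios)
    centred = [ratio - centre for ratio in log2_ratios]
    spread = variance(centred)
    return [ratio / spread for ratio in centred]


# Hypothetical log2 ratios for a single probe in three samples.
print(normalize_probe([-0.2, 0.0, 0.2]))
```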

deCODE genetics. Illumina Human317, Human370, HumanHap550, Human610 
and 1M BeadChips were used for CNV analysis. BeadStudio (version 2.0) was used 
to call genotypes, normalize the signal intensity data and establish the log R ratio 
(LRR) and B allele frequency (BAF) at every SNP according to standard Illumina 
protocols. All samples passed a standard SNP-based quality control procedure 
with a SNP call rate greater than 0.97. PennCNV”’, a free, open-source tool, was 
used for detection of CNVs. The input data for PennCNV are LRR, a normalized 
measure of the total signal intensity for the two alleles of the SNP, and BAF, a 
normalized measure of the allelic intensity ratio of the two alleles. These values are 
derived with the help of control genotype clusters (HapMap samples), using the 
Illumina BeadStudio software. PennCNV employs a hidden Markov model to 
analyse the LRR and BAF values across the genome. CNV calls are made on the 
basis of the probability of a given copy state at the current marker, as well as on 
the probability of observing a copy-state change from the previous marker to the 
current one. PennCNV uses a built-in correction model for GC content”*. 
Cohorte Lausannoise (CoLaus). Data normalization and CNV calling have been 
previously described'®. Data normalization included allelic cross-talk calibration*”°, 
intensity summarization using robust median average, and correction for any PCR 
amplification bias. Wave effects were corrected by fitting a Loess function*”. CNV 
calling was done using a Gaussian mixture model (Valsesia et al., submitted) that fits 
four components (deletion, copy-neutral, one additional copy and two additional 
copies) to copy-number ratios. The final copy number at each probe location is 
determined as the expected (dosage) copy number. The method has been validated 
by comparing test data sets with results from the CNAT" and CBS**” algorithms, 
and by replicating a subset of CoLaus subjects on Illumina arrays. Only duplications 
found by both Gaussian mixture model and CBS were considered. 

Estonian genome center of the University of Tartu (EGCUT). Genotypes were 
called by BeadStudio software GT module v3.1 or GenomeStudio GT v1.6 
(Illumina Inc.). Values for LRR and BAF produced by BeadStudio were formatted 
for further CNV analysis and break-point mapping with the hidden-Markov-model-based 
software QuantiSNP (ver. 1.1)** and PennCNV™ or CNVPartition 2.4.4 
(Illumina Inc.). All analyses were carried out using the recommended settings, 
except changing EMiters to 25 and L to 1,000,000 in QuantiSNP. For PennCNV, 
the Estonian-population-specific SNP allele frequency data was used. All detected 
duplications were confirmed by quantitative PCR. 

Study of health in Pomerania (SHIP). Raw intensities were normalized using 
Affymetrix power tools (Affymetrix); CNV analysis was done using Birdseye from 
the Birdsuite software package*® and PennCNV*. PennCNV predictions with 
confidence scores less than 10 were removed. Birdsuite predictions were filtered 
as in ref. 15: CNVs were kept if their logarithm of odds (LOD) score was >10, 
length >1 kb, number of probes ≥5 and size per number of probes <10,000. 
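
The Birdsuite filtering criteria listed above can be expressed as a simple predicate. This is a minimal sketch: the `CnvCall` class and its field names are hypothetical and do not correspond to the Birdsuite output format, and the probe-count threshold is read as "at least 5".

```python
from dataclasses import dataclass


@dataclass
class CnvCall:
    """Hypothetical container for one predicted CNV (not a Birdsuite type)."""
    start: int      # start coordinate in base pairs
    end: int        # end coordinate in base pairs
    n_probes: int   # number of probes supporting the call
    lod: float      # LOD confidence score reported for the call

    @property
    def length(self) -> int:
        return self.end - self.start


def passes_filter(call: CnvCall) -> bool:
    """Apply the four thresholds: LOD > 10, length > 1 kb,
    at least 5 probes, and fewer than 10,000 bp per probe."""
    return (
        call.lod > 10
        and call.length > 1_000
        and call.n_probes >= 5
        and call.length / call.n_probes < 10_000
    )


# Hypothetical calls, for illustration only.
calls = [
    CnvCall(start=29_500_000, end=30_100_000, n_probes=300, lod=25.0),
    CnvCall(start=1_000, end=1_500, n_probes=10, lod=50.0),  # too short
]
print([passes_filter(c) for c in calls])
```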
Kooperative Gesundheitsforschung in der Region Augsburg (KORA) F3 and 
F4. Genotyping for KORA F3 was performed using the Affymetrix 500K array set, 
consisting of two chips (StyI and NspI). The KORA F4 samples were genotyped 
with the Affymetrix human SNP array 6.0. For both studies, genomic DNA from 
blood samples was used for analysis. Hybridization of genomic DNA was done in 
accordance with the manufacturer’s standard recommendations. Genotyping was 
done in the Genome Analysis Centre of the Helmholtz Centre Munich. Genotypes 
were determined using BRLMM clustering algorithm (Affymetrix 500K array set) 
and Birdseed2 clustering algorithm (Affymetrix array 6.0). For quality control 
purposes, we applied a positive control and a negative control DNA every 48 sam- 
ples (KORA F3) or 96 samples (KORA F4). On the chip level, only subjects with 
overall genotyping efficiencies of at least 93% were included. In addition, the called 
gender had to agree with the gender in the KORA study database. After exclusions, 
1,644 individuals remained in KORA F3 and 1,814 in KORA F4 for further 
analysis. 

MLPA analysis. We used MLPA to determine changes in the copy number of a 
region of about 2 Mb on chromosome 16p11.2. Briefly, we designed, using hg18, 
nine probes within the targeted region, one control probe outside the rearranged 
region and seven control probes targeting unique positions in the genome 
(Supplementary Table 10). Assays were performed with MRC-Holland reagents 
according to the manufacturer’s protocol’. The analysis of the amplification 
products was performed by capillary electrophoresis in the DNA Analyser 
3730XL and using the GeneMapper software v3.7 (Applied Biosystems). The 
calculations were performed independently for each experiment: we first normal- 
ized the MLPA data to minimize the amount of experimental variation, summing 
all signal values of each control probe for each sample, and then dividing each 
signal value of each sample by the sum. The normalized signal values were com- 
pared to signal values from all other samples in the same experiment, dividing the 
normalized signal values by the average calculated from all the samples in the same 
experiment. The product of this calculation is termed dosage quotient (DQ). A DQ 
value of less than 0.65 or more than 1.25 was considered as copy-number loss or 
gain, respectively, as previously described**”. 
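
The dosage-quotient calculation described above can be sketched in a few lines. This is one reading of the text, not the authors' code: the function and variable names are hypothetical, and the reference average is taken over all samples in the experiment, including the sample being tested.

```python
def dosage_quotients(signals, control_probes):
    """Compute MLPA dosage quotients (DQ) for one experiment.

    signals: {sample_name: {probe_name: peak_signal}}
    control_probes: names of the control probes used for normalization.
    """
    # Step 1: within each sample, divide every signal by the summed control signal.
    normalized = {}
    for sample, probes in signals.items():
        control_sum = sum(probes[name] for name in control_probes)
        normalized[sample] = {name: value / control_sum for name, value in probes.items()}

    # Step 2: divide by the per-probe average over all samples in the experiment.
    probe_names = list(next(iter(signals.values())))
    n_samples = len(normalized)
    mean_by_probe = {
        name: sum(normalized[sample][name] for sample in normalized) / n_samples
        for name in probe_names
    }
    return {
        sample: {name: normalized[sample][name] / mean_by_probe[name] for name in probe_names}
        for sample in normalized
    }


def classify(dq, loss_below=0.65, gain_above=1.25):
    """Thresholds from the text: DQ < 0.65 is a loss, DQ > 1.25 is a gain."""
    if dq < loss_below:
        return "loss"
    if dq > gain_above:
        return "gain"
    return "normal"


# Hypothetical signals: two control probes (c1, c2) and one target probe (t).
signals = {
    "s1": {"c1": 100, "c2": 100, "t": 100},
    "s2": {"c1": 100, "c2": 100, "t": 100},
    "s3": {"c1": 100, "c2": 100, "t": 100},
    "dup": {"c1": 100, "c2": 100, "t": 150},
}
dq = dosage_quotients(signals, control_probes=["c1", "c2"])
print({sample: classify(values["t"]) for sample, values in dq.items()})
```

Because the carrier's own signal contributes to the reference average, experiments with few samples or several carriers would pull the quotient towards 1; a reference built from known normal samples would avoid this.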

Custom array-CGH for the short arm of chromosome 16. DNA samples were 
labelled with Cy3 and cohybridized to custom-made Nimblegen arrays with Cy5- 
labelled DNA from the CEPH cell line GM12042. These arrays contained 71,000 
probes spread across the short arm of chromosome 16 from 22.0 Mb to 32.7 Mb (at 
a median space of 45 bp between 27.5 Mb and 31.0 Mb), and 1,000 control probes 
situated in invariable regions of the X chromosome. DNA labelling, hybridization 
and washing were performed according to Nimblegen protocols. Scanning was 
performed using an Agilent G2565BA microarray scanner. Image processing, 
quality control and data extraction were performed using the Nimblescan software 
v.2.5. 

Defining underweight. Underweight was defined throughout the study as 
BMI ≤ 18.5 kg per m² in adults and ≤ −2 s.d. in children*™“”“*. 

Weight, height, BMI and head circumference Z-scores as a function of age. For 
paediatric cases (0-18 years of age), weight, height, BMI and head circumference 
Z-scores were determined using clinical growth charts specific 
to the country of origin. Children were ascertained from nine different countries. If 
charts were only available in percentiles, those measures were transformed into 
Z-scores using gender-, age- and geographically-matched reference populations 
(see Statistics). 

For the USA and Canada, data from the Center for Disease Control and 
National Center for Health Statistics (CDC/NCHS) were used to calculate 
Z-scores””. 

For the French paediatric population, we used French national growth 
charts*’*'. For the Swiss paediatric population, we used Swiss national growth 
charts”. For Dutch participants, Dutch national growth charts were used”. For 
Italian, German, Finnish and Austrian cases (n = 6), height, weight and BMI 
Z-scores were estimated using WHO growth charts™. 

To check for discrepancies generated by the use of different growth charts, 
height, weight and BMI Z-scores were recalculated using WHO growth charts 
for all cases under five years of age, regardless of origin 
(http://www.who.int/childgrowth/standards/en/; ref. 54). Z-scores obtained using the WHO data were not 
significantly different. These growth standards, developed by the World Health 
Organization multicentre growth reference study, describe normal child growth 
from birth to 5 years under optimal environmental conditions. These standards 
can be applied to all children everywhere, regardless of ethnicity, socioeconomic 
status and type of feeding®**. 

If necessary, percentile values were transformed to Z-scores using the inverse of the 
standard normal cumulative distribution function. When growth charts were unavailable, we used reported 
LMS parameters (median (M), generalized coefficient of variation (S) and skew- 
ness (L)) to obtain Z-scores via the formula: 


$$
Z\text{-score} =
\begin{cases}
\dfrac{(X/M)^{L} - 1}{L\,S}, & L \neq 0 \\[6pt]
\dfrac{\ln(X/M)}{S}, & L = 0
\end{cases}
$$


in which X is the observed value. 
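
The two conversions described above (LMS parameters to Z-score, and percentile to Z-score) can be sketched as follows. The function names and the numerical values in the usage lines are hypothetical; only the LMS formula itself comes from the text.

```python
import math
from statistics import NormalDist


def lms_z_score(x, L, M, S):
    """Z-score of an observed value x given LMS parameters.

    L: skewness (Box-Cox power), M: median, S: generalized coefficient of variation.
    """
    if L == 0:
        return math.log(x / M) / S
    return ((x / M) ** L - 1) / (L * S)


def percentile_to_z(percentile):
    """Convert a percentile (0-100, exclusive) to a Z-score using the
    inverse of the standard normal cumulative distribution function."""
    return NormalDist().inv_cdf(percentile / 100)


# Hypothetical LMS parameters, for illustration only.
print(lms_z_score(18.0, L=1.0, M=20.0, S=0.1))   # observed value below the median
print(lms_z_score(20.0, L=-1.5, M=20.0, S=0.1))  # observed value equal to the median
print(percentile_to_z(50))                       # the median percentile
```

An observed value equal to the median M gives a Z-score of zero for any L and S, which is a convenient sanity check on tabulated parameters.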
In adults (>18 years of age), we estimated LMS parameters when these were 
unavailable from the available sex-, age- and origin-matched Swiss (CoLaus), 
Estonian or French control populations. For cases identified from population-
based cohorts, Z-scores were directly inferred from the cohort. 
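
The two Z-score conversions described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the example measurement and LMS values are invented, and percentiles are assumed to lie strictly between 0 and 100.

```python
import math
from statistics import NormalDist


def percentile_to_z(percentile):
    """Convert a growth-chart percentile (strictly between 0 and 100) to a Z-score."""
    return NormalDist().inv_cdf(percentile / 100.0)


def lms_z_score(x, l, m, s):
    """Z-score of an observed value x from LMS parameters.

    l: skewness (L), m: median (M), s: generalized coefficient of variation (S).
    """
    if l == 0:
        return math.log(x / m) / s
    return ((x / m) ** l - 1.0) / (l * s)


# Hypothetical example: a BMI of 17.6 against a reference median of 16.0.
z_from_lms = lms_z_score(17.6, l=1.0, m=16.0, s=0.1)
z_from_percentile = percentile_to_z(97.5)
```

With L = 1 the LMS formula reduces to the familiar (X − M)/(M·S), which is a quick way to sanity-check an implementation.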

Gene expression. We established lymphoblastoid cell lines from deletion and 
duplication carriers, as well as from controls (Supplementary Table 12), by trans- 
forming peripheral blood mononuclear cells with Epstein-Barr virus. Patients and 
controls were enrolled after obtaining appropriate informed consent via the physi- 
cians in charge, and approval by the ethics committee of the University of Lausanne. 
More control cell lines were obtained from Coriell Institute for Medical Research 
(http://www.coriell.org/) (Supplementary Table 12). SYBR Green real-time quantitative
PCR (RT-PCR) was performed as previously described. Briefly, 1 µg of
total RNA from lymphoblastoid cell lines was converted to complementary DNA 
using Superscript VILO (Invitrogen) primed with a mixture of oligo(dT) and random 
hexamers. Oligos were designed using the PrimerExpress program (Applied 
Biosystems) with default parameters (Supplementary Table 11). Non-intron- 
spanning assays were tested for genomic contamination in standard + reverse 
transcriptase reactions. The amplification efficiency of each primer pair was tested 
in a cDNA dilution series, as previously described. A full list of genes mapping in
the rearranged interval, and exclusion criteria, are presented in Supplementary 
Table 1. All RT-PCR reactions were performed in a 10-µl final volume and in triplicate
for each sample. The setup in a 384-well plate format was performed using a
Freedom EVO robot (TECAN) and assays were run in an ABI 7900 sequence 
detection system (Applied Biosystems) with the following amplification conditions:
50 °C for 2 min, 95 °C for 10 min, and 45 cycles of 95 °C for 15 s, then 60 °C for
1 min. A final incubation of 95 °C for 15 s followed by 60 °C for 15 s was carried out 
to establish a dissociation curve. Each plate included the appropriate normalization
genes to control for any variability between plate runs. Raw threshold cycle
(Ct) values were obtained using SDS2.4 (Applied Biosystems). To calculate the
normalized relative expression ratio of individuals carrying the CNV and of
controls, we used Biogazelle qBase Plus software, including geNorm. This
program identified appropriate normalization genes (EEF1A1, RPL13, GUSB
and TBP) having a gene-stability measure of M = 0.25. We note that one gene,
LAT, showed a very high expression profile in one of the duplication samples 
(DASYL, Supplementary Table 13), reaching a relative expression value of 27.3 
(s.e.m. = 1.37), compared to an average expression for other duplications of 1.89 
(s.e.m. = 0.51). We cannot exclude that this finding is genuine (and confirmed it in 
a second experiment), but it was removed from further analyses as an outlier to give 
a more accurate overview of expression profiles for these genes. 
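
The normalization step can be illustrated with a small sketch. The text does not spell out the computation performed by the software, so what follows is an assumption: a generic efficiency-corrected relative quantity divided by the geometric mean of the reference-gene quantities. Function names and Ct values are hypothetical.

```python
import math


def relative_quantity(ct_sample, ct_calibrator, efficiency=2.0):
    """Efficiency-corrected quantity of one gene relative to a calibrator Ct."""
    return efficiency ** (ct_calibrator - ct_sample)


def normalized_relative_quantity(target, references):
    """Target quantity divided by the geometric mean of reference-gene quantities.

    `target` and each entry of `references` are tuples of
    (ct_sample, ct_calibrator, efficiency).
    """
    target_rq = relative_quantity(*target)
    ref_rqs = [relative_quantity(*ref) for ref in references]
    geometric_mean = math.exp(sum(math.log(rq) for rq in ref_rqs) / len(ref_rqs))
    return target_rq / geometric_mean


# Hypothetical Ct values: the target amplifies one cycle earlier than the
# calibrator, while four reference genes are unchanged.
nrq = normalized_relative_quantity((24.0, 25.0, 2.0), [(20.0, 20.0, 2.0)] * 4)
```

Using the geometric rather than the arithmetic mean keeps a single highly expressed reference gene from dominating the normalization factor.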

In silico analysis was performed to check for brain, and specifically hypothalamus, 
expression of genes in the rearranged 16p11.2 interval (Supplementary Table 1). This 
was done using Allen Brain Atlas Resources, available from http://www.brain-map. 
org. 

Cases with major neurological signs. Major neurological signs were defined 
by moderate to severe hypotonia, hypertonia, ataxia, spasticity, hyperreflexia,
hyporeflexia and/or extra-pyramidal signs, and by the presence of epilepsy. 
Statistics. Student’s t-test: one-tailed t-tests were performed to test whether 
duplication carriers have Z-score values lower than zero for BMI, height and 
weight. We found this analysis more suitable than linear regression analysis, 
correcting for confounding factors such as sex and age, because these anthro- 
pometric traits have a highly nonlinear dependence on these factors, as can be 
observed in control populations. 
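
As an illustration of the one-tailed test against zero, the sketch below computes the t statistic and a lower-tail P value using only the Python standard library. The Student's t distribution function is obtained by numerical integration (Simpson's rule), which is a convenience chosen here and not something stated in the text; the Z-scores in the example are invented.

```python
import math
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def one_tailed_t_test(z_scores):
    """Test H1: mean Z-score < 0. Returns (t statistic, lower-tail P value)."""
    n = len(z_scores)
    t = mean(z_scores) / (stdev(z_scores) / math.sqrt(n))
    return t, t_cdf(t, n - 1)


# Hypothetical BMI Z-scores for three carriers.
t_stat, p_value = one_tailed_t_test([-1.0, -2.0, -3.0])
```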

Kruskal-Wallis test: this was used to test differences in the gene expression 
pattern between deletion and duplication carriers and control individuals. 
Because expression values are not necessarily normally distributed, this test is more
appropriate than a classical one-way analysis of variance. To test pairwise differences,
we computed the difference in mean group rank with its 95% confidence interval 
(as provided by the multcompare function in Matlab). Correction for multiple 
testing was done using a Bonferroni adjustment. 
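
A minimal sketch of the rank-based comparison follows. It computes the Kruskal-Wallis H statistic and the mean rank of each group, and applies a Bonferroni adjustment to a P value. It is not the Matlab routine used in the study: ties receive average ranks but no tie correction is applied to H, and the expression values are invented.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H statistic without tie correction, list of mean group ranks)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    rank_term, mean_ranks, start = 0.0, [], 0
    for g in groups:
        r = ranks[start:start + len(g)]
        rank_term += sum(r) ** 2 / len(g)
        mean_ranks.append(sum(r) / len(g))
        start += len(g)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    return h, mean_ranks


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical expression values for deletion carriers, controls and duplication carriers.
h_stat, mean_ranks = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With three groups there are three pairwise comparisons, so a raw pairwise P value would be multiplied by 3 under this adjustment.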

Multiple testing: we determined false-discovery-rate-based thresholds for asso- 
ciation P-values for each phenotype, to correct for multiple testing. For each 
phenotype, we replaced the observed Z-scores with numbers randomly drawn 
from a standard normal distribution and performed the same t-tests for the same
strata. The procedure was repeated 1,000 times. For various P-value thresholds, we 
asked how many tests would be declared significant for the null set on average 
(over the 1,000 random draws). The false discovery rate was estimated as the ratio 
of this number and the actual number obtained for the observed Z-scores. Thus, 
we controlled the dependence between nested tests. 
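
The ratio at the heart of this procedure can be sketched as follows. The function assumes the P values have already been computed, both for the observed Z-scores and for each random draw; the name `estimated_fdr`, the cap at 1 and the toy P values are assumptions made for illustration.

```python
def estimated_fdr(observed_p, null_p_draws, threshold):
    """Estimate the false discovery rate at a given P-value threshold.

    observed_p: P values from the real data (one per test).
    null_p_draws: list of lists, each holding the P values of the same tests
        recomputed on one random draw from the null distribution.
    """
    observed_hits = sum(p <= threshold for p in observed_p)
    if observed_hits == 0:
        return 0.0
    mean_null_hits = sum(
        sum(p <= threshold for p in draw) for draw in null_p_draws
    ) / len(null_p_draws)
    return min(1.0, mean_null_hits / observed_hits)


# Toy input: four tests and two null draws (the study used 1,000 draws).
fdr = estimated_fdr(
    [0.001, 0.02, 0.5, 0.8],
    [[0.03, 0.6, 0.7, 0.9], [0.2, 0.3, 0.04, 0.01]],
    threshold=0.05,
)
```

Because every null draw is run through the same set of nested strata as the real data, correlation between tests is carried into the null counts without having to be modelled explicitly.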

Relative risk: among adults, we defined underweight as a BMI <18.5 (WHO 
criteria). The estimated relative risk is the ratio of the fraction of underweight 
individuals among duplication carriers versus our control group. The standard 
error of log(relative risk) and its significance were calculated as previously
described. In our control group (population-based cohorts), the frequency of
being underweight is 1.9% (38 males and 148 females out of 9,470). Because the
prevalence of underweight decreases with age in the general population, we
resampled our control group to ensure precise age-matching.
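
A short worked example of the relative-risk calculation is given below. The standard error uses the usual large-sample formula for the logarithm of a ratio of two proportions, which is assumed here because the text only refers to a previous description. The control counts come from the text; the carrier counts (10 underweight out of 100) are hypothetical.

```python
import math


def relative_risk(cases_exposed, n_exposed, cases_control, n_control):
    """Return (relative risk, SE of ln(RR), 95% confidence interval)."""
    rr = (cases_exposed / n_exposed) / (cases_control / n_control)
    se_log = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, se_log, (lower, upper)


# Controls from the text: 38 males and 148 females underweight out of 9,470.
control_cases = 38 + 148
# Hypothetical carrier group: 10 underweight individuals out of 100.
rr, se_log, ci = relative_risk(10, 100, control_cases, 9470)
```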
Genetic variants in novel pathways influence blood
pressure and cardiovascular disease risk 


The International Consortium for Blood Pressure Genome-Wide Association Studies


Blood pressure is a heritable trait influenced by several biological
pathways and responsive to environmental stimuli. Over one
billion people worldwide have hypertension (≥140 mm Hg systolic
blood pressure or ≥90 mm Hg diastolic blood pressure). Even
small increments in blood pressure are associated with an
increased risk of cardiovascular events. This genome-wide association
study of systolic and diastolic blood pressure, which used
a multi-stage design in 200,000 individuals of European descent, 
identified sixteen novel loci: six of these loci contain genes 
previously known or suspected to regulate blood pressure 
(GUCY1A3-GUCY1B3, NPR3-C5orf23, ADM, FURIN-FES, 
GOSR2, GNAS-EDN3); the other ten provide new clues to blood 
pressure physiology. A genetic risk score based on 29 genome- 
wide significant variants was associated with hypertension, left 
ventricular wall thickness, stroke and coronary artery disease, 
but not kidney disease or kidney function. We also observed asso- 
ciations with blood pressure in East Asian, South Asian and 
African ancestry individuals. Our findings provide new insights 
into the genetics and biology of blood pressure, and suggest 
potential novel therapeutic pathways for cardiovascular disease 
prevention. 

Genetic approaches have advanced the understanding of biological 
pathways underlying inter-individual variation in blood pressure. For 
example, studies of rare Mendelian blood pressure disorders have 
identified multiple defects in renal sodium handling pathways.
More recently, two genome-wide association studies (GWAS), each
of >25,000 individuals of European ancestry, identified 13 loci associated
with systolic blood pressure (SBP), diastolic blood pressure
(DBP) and hypertension. We now report results of a new meta-analysis
of GWAS data that includes staged follow-up genotyping to
identify additional blood pressure loci. 

Primary analyses evaluated associations between 2.5 million geno- 
typed or imputed single nucleotide polymorphisms (SNPs) and SBP 
and DBP in 69,395 individuals of European ancestry from 29 studies 
(Supplementary Materials sections 1-3 and Supplementary Tables 1 
and 2). Following GWAS meta-analysis, we conducted a three-stage 
validation experiment that made efficient use of available genotyping 
resources, to follow up top signals in up to 133,661 additional indivi- 
duals of European descent (Supplementary Fig. 1 and Supplementary 
Materials section 4). Twenty-nine independent SNPs at 28 loci were 
significantly associated with SBP, DBP, or both in the meta-analysis 
combining discovery and follow-up data (Fig. 1, Table 1, Supplemen- 
tary Figs 2, 3 and Supplementary Tables 3-5). All 29 SNPs attained 
association P < 5 × 10^-9, an order of magnitude beyond the standard
genome-wide significance level for a single-stage experiment (Table 1). 
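
To illustrate how per-study estimates for one SNP can be combined across discovery and follow-up cohorts, the sketch below uses inverse-variance-weighted fixed-effects pooling. This section does not state which pooling method was used, so the choice of method is an assumption, and the two study estimates are invented.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effects meta-analysis of per-study effect estimates for one SNP.

    Returns (pooled beta, pooled standard error, z statistic).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical effects (mm Hg per allele) from two studies of equal precision.
beta, se, z = inverse_variance_meta([1.0, 3.0], [1.0, 1.0])
```

Studies with smaller standard errors, typically the larger cohorts, contribute proportionally more to the pooled estimate.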

Sixteen of these 29 associations were novel (Table 1). Two associa- 
tions were near the FURIN and GOSR2 genes; prior targeted analyses 
of variants in these genes suggested they may be blood pressure loci.
At the CACNB2 locus we validated association for a previously
reported SNP, rs4373814, and detected a novel independent association
for rs1813353 (pairwise r² = 0.015 in HapMap CEU). Of our
13 previously reported associations, only the association at PLCD3


was not supported by the current results (Supplementary Table 4). 
Some of the associations are in or near genes involved in pathways 
known to influence blood pressure (NPR3, GUCY1A3-GUCY1B3,
ADM, GNAS-EDN3, NPPA-NPPB and CYP17A1; Supplementary 
Fig. 4). Twenty-two of the 28 loci did not contain genes that were a 
priori strong biological candidates. 

As expected from prior blood pressure GWAS results, the effects of 
the novel variants on SBP and DBP were small (Fig. 1 and Table 1). For 
all variants, the observed directions of effects were concordant for SBP, 
DBP and hypertension (Fig. 1, Table 1 and Supplementary Fig. 3). 
Among the genes at the genome-wide significant loci, only CYP17A1, 
previously implicated in Mendelian congenital adrenal hyperplasia and 
hypertension, is known to harbour rare variants that have large effects 
on blood pressure.

We performed several analyses to identify potential causal alleles 
and mechanisms. First, we looked up the 29 genome-wide significant 
index SNPs and their close proxies (r² > 0.8) among cis-acting expression
SNP (eSNP) results from multiple tissues (Supplementary
Materials section 5). For 13/29 index SNPs, we found an association 
between nearby eSNP variants and the expression levels of at least one 
gene transcript (10 *>P>10 *'; Supplementary Table 6). In five 
cases, the index blood pressure SNP and the best eSNP from a genome- 
wide survey were identical, highlighting potential mediators of the 
SNP-blood pressure associations. 

Second, because changes in protein sequence are a priori strong 
functional candidates, we sought non-synonymous coding SNPs that 
were in high linkage disequilibrium (r² > 0.8) with the 29 index SNPs.
We identified such SNPs at eight loci (Table 1, Supplementary 
Materials section 6 and Supplementary Table 7). In addition we per- 
formed analyses testing for differences in genetic effect according to 
body mass index (BMI) or sex, and analyses of copy number variants, 
pathway enrichment and metabolomic data, but we did not find any 
statistically significant results (Supplementary Materials sections 7-9 
and Supplementary Tables 8-10). 
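
The r² > 0.8 criterion used above to define close proxies can be made concrete with the standard two-locus formula, computed from haplotype and allele frequencies. The function names and the frequencies in the example are hypothetical; in practice r² would be taken from a reference panel such as HapMap.

```python
def ld_r_squared(freq_ab, freq_a, freq_b):
    """Squared correlation between alleles A and B at two loci.

    freq_ab: frequency of the haplotype carrying both A and B.
    freq_a, freq_b: frequencies of alleles A and B.
    """
    d = freq_ab - freq_a * freq_b
    return d * d / (freq_a * (1 - freq_a) * freq_b * (1 - freq_b))


def is_close_proxy(freq_ab, freq_a, freq_b, threshold=0.8):
    """True when r² exceeds the proxy threshold used in the text (0.8)."""
    return ld_r_squared(freq_ab, freq_a, freq_b) > threshold


# Hypothetical frequencies for two pairs of SNPs.
perfect = ld_r_squared(0.5, 0.5, 0.5)   # alleles always travel together
partial = ld_r_squared(0.3, 0.4, 0.5)   # moderate correlation
```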

We evaluated whether the blood pressure variants we identified 
in individuals of European ancestry were associated with blood pressure 
in individuals of East Asian (N = 29,719), South Asian (N = 23,977) 
and African (N= 19,775) ancestries (Table 1 and Supplementary 
Tables 11-13). We found significant associations in individuals of 
East Asian ancestry for SNPs at nine loci and in individuals of South 
Asian ancestry for SNPs at six loci; some have been reported previously 
(Supplementary Tables 12 and 15). The lack of significant association 
for individual SNPs may reflect small sample sizes, differences in allele 
frequencies or linkage disequilibrium patterns, imprecise imputation 
for some ancestries using existing reference samples, or a genuinely 
different underlying genetic architecture. Because of limited power to 
detect effects of individual variants in the smaller non-European sam- 
ples, we created genetic risk scores for SBP and DBP incorporating all 29 
blood pressure variants weighted according to effect sizes observed 
in the European samples. In each non-European ancestry group, risk 
scores were strongly associated with SBP (P=1.1X 10 *° in East 
Asian, P=2.9X10 } in South Asian, P=9.8 X10 * in African 
Figure 1 | Genome-wide −log10 P-value plots and effects for significant
loci. a, b, Genome-wide −log10 P-value plots are shown for SBP (a) and DBP
(b). SNPs within loci reaching genome-wide significance are labelled in red for
SBP and blue for DBP (±2.5 Mb of lowest P value) and lowest P values in the
initial genome-wide analysis as well as the results of analysis including 
validation data are labelled separately. The lowest P values in the initial GWAS 
are denoted with a X. The range of different sample sizes in the final meta- 


ancestry individuals) and DBP (P = 2.9 x 10 8, P=95x10 and 
P=5.3 X 10 °, respectively; Supplementary Table 13). 

We also created a genetic risk score to assess association of the 
variants in aggregate with hypertension and with clinical measures 
of hypertensive complications including left ventricular mass, left 
ventricular wall thickness, incident heart failure, incident and preval- 
ent stroke, prevalent coronary artery disease (CAD), kidney disease 
and measures of kidney function, using results from other GWAS 
consortia (Table 2, Supplementary Materials sections 10, 11 and 
Supplementary Table 14). The risk score was weighted using the aver- 
age of SBP and DBP effects for the 29 SNPs. In an independent sample 
of 23,294 women, an increase of one standard deviation in the genetic
risk score was associated with a 23% increase in the odds of hyperten- 
sion (95% confidence interval 19-28%; Table 2 and Supplementary 
Table 14). Among individuals in the top decile of the risk score, the 
prevalence of hypertension was 29% compared with 16% in the bottom 
decile (odds ratio 2.09, 95% confidence interval 1.86-2.36). Similar 
results were observed in an independent hypertension case-control 
sample (Table 2). In our study, individuals in the top compared to 
bottom quintiles of genetic risk score differed by 4.6 mm Hg SBP and 
3.0 mm Hg DBP, differences that approach population-averaged blood 
pressure treatment effects for a single antihypertensive agent.
Epidemiological data have shown that differences in SBP and DBP
of this magnitude, across the population range of blood pressure,
are associated with an increase in cardiovascular disease risk.
Consistent with this and in line with findings from randomized trials 


104 | NATURE | VOL 478 | 6 OCTOBER 2011 


[Figure 1 image: panels a and b are genome-wide plots by chromosome with the significant loci labelled; panel c shows effect sizes for the 29 variants, listed by locus. The plot content is not recoverable from the extracted text.]

analysis including the validation data are indicated as: circle (96,000-140,000), 
triangle (>140,000-180,000) and diamond (>180,000-220,000). SNPs near 
unconfirmed loci are in black. The horizontal dotted line is P= 2.5 X 10°. 
GUCY denotes GUCY1A3-GUCY1B3. c, Effect size estimates and 95%
confidence bars per blood-pressure-increasing allele of the 29 significant 
variants for SBP (red) and DBP (blue). Effect sizes are expressed in mm Hg per 
of blood-pressure-lowering medication in hypertensive patients,
the genetic risk score was positively associated with left ventri- 
cular wall thickness (P=6.0X10 °), occurrence of stroke 
(P = 3.3 X 10°) and CAD (P= 8.1 X 10°”). The same genetic risk 
score was not, however, significantly associated with chronic kidney 
disease or measures of kidney function, even though these renal out- 
comes were available in a similar sample size as for the other outcomes 
(Table 2). The absence of association with kidney phenotypes could be 
explained by a weaker causal relationship between blood pressure and 
kidney phenotypes than with CAD and stroke. This finding is consist- 
ent with the mismatch between observational data that show a positive 
association of blood pressure with kidney disease, and clinical trial data 
that show inconsistent evidence of a benefit from blood pressure lowering
on kidney disease prevention in patients with hypertension.
Thus, several lines of evidence converge to indicate that blood pressure 
elevation may in part be a consequence rather than a cause of sub- 
clinical kidney disease. 
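
The construction of the risk score, a sum of allele counts weighted by the average of the SBP and DBP effects, can be sketched as follows. The per-allele effects for FGF5 and ATP2B1 are taken from Table 1; the allele dosages and all function names are hypothetical, and a real score would include all 29 variants.

```python
# Per-allele effects in mm Hg as (SBP beta, DBP beta), from Table 1.
EFFECTS = {
    "FGF5": (0.706, 0.457),
    "ATP2B1": (0.928, 0.522),
}


def snp_weight(sbp_beta, dbp_beta):
    """Weight of one SNP: the average of its SBP and DBP effects."""
    return (sbp_beta + dbp_beta) / 2


def genetic_risk_score(dosages, effects):
    """Weighted sum of coded-allele dosages (0, 1 or 2, or imputed fractions)."""
    return sum(
        snp_weight(*effects[locus]) * dose for locus, dose in dosages.items()
    )


# Hypothetical individual: one coded allele at FGF5, two at ATP2B1.
score = genetic_risk_score({"FGF5": 1, "ATP2B1": 2}, EFFECTS)
```

Before it is compared across individuals, such a score is usually standardized to units of one standard deviation, which is the scale on which the associations in Table 2 are reported.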

Our discovery meta-analysis (Supplementary Fig. 2) suggests an 
excess of modestly significant (10. * < P< 10° *) associations probably 
arising from common blood pressure variants of small effect. By divid- 
ing our principal GWAS data set into non-overlapping discovery 
(N ~ 56,000) and validation (N ~ 14,000) subsets, we found robust 
evidence for the existence of such undetected common variants 
(Supplementary Fig. 5 and Supplementary Materials section 12). We 
estimate that there are 116 (95% confidence interval 57-174) independent
blood pressure variants with effect sizes similar to those
Table 1 | Summary association results for 29 blood pressure SNPs
Locus | Index SNP | Chr | Position | CA/NCA | CAF | nsSNP | eSNP | SBP beta | SBP P value | SBP effect in EA/SA/A | DBP beta | DBP P value | DBP effect in EA/SA/A | HTN beta | HTN P value

MOV10 rs2932538 1 113,018,066 G/A 0.75 Y(p) Y(p) 0.388 1.2x10°2 +/+/- 0.240 99x10710 +/+*/- 0.049 2.9 x 1077
SLC4A7 rs13082711 3 27,512,913 T/C 0.78 Y(p) Y(p) -0.315 1.5x10© -/-/+ -0.238 3.8x10°° -/-/+ -0.035 3.6x10+
MECOM rs419076 3 170,583,580 T/C 0.47 - - 0.409 18x10°!% +/+/+ 0.241 2.110712 +/+/- 0.031 3.1x10-%
SLC39A8 rs13107325 4 103,407,732 T/C 0.05 Y Y(+) -0.981 33x10 2?/+/+ -0.684 2.3x10°'? 2/+/+ -0.105 4.9x10°’
GUCY1A3-GUCY1B3 rs13139571 4 156,864,963 C/A 0.76 - - 0.321 12x10 +/-/+ 0.260 2.2x107!° +/-/+ 0.042 2.5x10~°
NPR3-C5orf23 rs1173771 5 32,850,785 G/A 0.60 - - 0.504 18x1071© +*/+/+ 0.261 9.1x107!2 +%*/+/- 0.062 3.2 x1071°
EBF1 rs11953630 5 157,777,980 T/C 0.37 - - -0.412 30x10! +/+/+ -0.281 3810713 4+/+/ 0.052 1.7x10°’
HFE rs1799945 6 26,199,158 G/C 0.14 Y - 0.627 7.7X10°1% +/+/- 0.457 1.5x107!5 +/+/- 0.095 1.8x1071°
BAT2-BAT5 rs805303 6 31,724,345 G/A 0.61 Y(p) Y(+) 0.376 15x101! -/-/? 0.228 3.0x1071! -/-/+ 0.054 1.1 x 107!°
CACNB2(5') rs4373814 10 18,459,978 G/C 0.55 - - -0.373 48x107! 4+/+/- -0.218 44x107!° -/+/- -0.046 835x108
PLCE1 rs932764 10 95,885,930 G/A 0.44 - - 0.484 7.1x10'© +/+/- 0.185 81x10°7 +/+/- 0.055 94x10°°
ADM rs7129220 11 10,307,114 G/A 0.89 - - -0.619 30x10! 2/-/+ -0.299 64x10°8 ?/-/+ -0.044 11x10%
FLJ32810-TMEM133 rs633185 11 100,098,748 G/C 0.28 - - -0.565 1.2x10°'7 +#*/+/+ -0.328 2.0x10°19 +#/+/ 0.070 5.4x1071!
FURIN-FES rs2521501 15 89,238,392 T/A 0.31 - Y(-) 0.650 5.2x10°19 +*/+/+ 0.359 1910715 +*/+/4 0.059 7.0x10°”
GOSR2 rs17608766 17 42,368,270 T/C 0.86 - Y(+) -0.556 1.110719 +/-/+ -0.129 0.017 +/-/+ -0.025 0.08
JAG1 rs1327235 20 0,917,030 G/A 0.46 - - 0.340 19x10 +*/+/+ 0.302 14x1071!5 +*/+*/+ 0.034 46x10~*
GNAS-EDN3 rs6015450 20 57,184,512 G/A 0.12 Y(p) - 0.896 3.9x10°73 2/+/+ 0.557 5.6 x 10°23 2/+*/+ 0.110 4.2 x 1074
MTHFR-NPPB rs17367504 1 1,785,365 G/A 0.15 - Y(-/r) -0.903 8.7 x10-%? +/+/+ -0.547 3.5x10°!9 +/+/ 0.103 2.3 x10°1°
ULK4 rs3774372 3 41,852,418 T/C 0.83 Y Y(r/p) -0.067 0.39 -/-/+ -0.367 9.0x10°'4 4+/+/ 0.017 0.18
FGF5 rs1458038 4 81,383,747 T/C 0.29 - - 0.706 1.5107 +*/+/+ 0.457 8.510725 +#/+*/+ 0.072 1.9x107’
CACNB2(3') rs1813353 10 8,747,454 T/C 0.68 - - 0.569 26x10! +/+/+ 0.415 2.3x107!9 +/+/ 0.078 6.2 x 1071°
C10orf107 rs4590817 10 63,137,559 G/C 0.84 - Y(r) 0.646 4.0x10712 -/+/- 0.419 1.3x107!2 -/-/- 0.096 98x10
CYP17A1-NT5C2 rs11191548 10 104,836,168 T/C 0.91 - Y(-) 1.095 6.9x10~° +*/+*/+ 0.464 9.4x107!5 +*/+*/+ 0.097 14x10~°
PLEKHA7 rs381815 1 6,858,844 T/C 0.26 - - 0.575 53x10! +*/+/+ 0.348 5.3x10710 +*/-/+ 0.062 34x10°°
ATP2B1 rs17249754 12 88,584,717 G/A 0.84 - - 0.928 1.8x107!8 +#/+*/- 0.522 1.21074 +*/+*/ 0.126 1.1x10°'4
SH2B3 rs3184504 2 110368991 T/C 0.47 Y Y(+) 0.598 38x10 1!8 -/-/+ 0.448 3.6x10°%5 -/-/+ 0.056 2.6x10 6
TBX5-TBX3 rs10850411 12 113,872,179 T/C 0.7 - - 0.354 54x10°8 -/+/- 0.253 54x1071° -/-/- 0.045 5.2x10°°
CYP1A1-ULK3 rs1378942 15 72,864,420 C/A 0.35 - Y(+) 0.613 5.7x10°73 +*/+/+ 0.416 2.7x10°°° +*/+/- 0.073 1.0x10°®
ZNF652 rs12940887 17 44,757,806 T/C 0.38 - Y(-) 0.362 18x10°1° +/-/+ 0.27 2.3107 +/-/+ 0.046 1.2x10~’
Summary association statistics, based on combined discovery and follow-up data, for 29 independent SNPs in individuals of European ancestry are shown. New genome-wide significant findings (17 SNPs) are
presented in the top half of the table; data on 12 previously published signals are presented in the lower half. Y indicates that the blood pressure index SNP is a non-synonymous (ns)SNP; Y(p) indicates that a proxy SNP
is an nsSNP. Y(+) indicates that the blood pressure index SNP is the strongest known eSNP for a transcript; Y(-) indicates that the blood pressure index SNP is an eSNP but not the strongest known eSNP for any
transcript. Y(r) indicates that the blood pressure index SNP is the strongest known eSNP in a targeted real-time PCR experiment. Y(p) indicates that a proxy SNP (r² > 0.8) to a blood pressure SNP is an eSNP but not
the strongest known eSNP. Observed effect directions in East Asian (EA), South Asian (SA) and African (A) ancestry individuals are coded + or - if concordant or discordant with directions in European ancestry
results. Effect size estimates (beta) correspond to mm Hg per coded allele for SBP and DBP and ln(odds) per coded allele for hypertension (HTN). CA, coded allele; CAF, coded allele frequency; NCA, non-coded
allele. ? denotes missing data. Genomic positions use NCBI Build 36 coordinates.
* Significant, controlling the FDR at 5% over 58 tests per ancestry (Supplementary Tables 5 and 12).


reported here, which collectively can explain ~2.2% of the phenotypic
variance for SBP and DBP, compared with 0.9% explained by the 29
associations discovered thus far (Supplementary Fig. 6 and Supplementary
Materials section 13).

Most of the 28 blood pressure loci harbour multiple genes
(Supplementary Table 15 and Supplementary Fig. 4), and although
substantial research is required to identify the specific genes and variants
responsible for these associations, several loci contain highly
plausible biological candidates. The NPPA and NPPB genes at the
MTHFR-NPPB locus encode precursors for atrial- and B-type
natriuretic peptides (ANP, BNP), and previous work has identified
SNPs, modestly correlated with our index SNP at this locus, which
are associated with plasma ANP, BNP and blood pressure. We found
the index SNP at this locus was associated with opposite effects on
blood pressure and on ANP/BNP levels, consistent with a model in
which the variants act through increased ANP/BNP production to
lower blood pressure (Supplementary Materials section 14).

Two other loci identified in the current study harbour genes
involved in natriuretic peptide and related nitric oxide signalling pathways,
both of which act to regulate cyclic guanosine monophosphate.
The first locus contains NPR3, which encodes the natriuretic
peptide clearance receptor (NPR-C). NPR3 knockout mice exhibit
reduced clearance of circulating natriuretic peptides and lower blood
pressure. The second locus includes GUCY1A3 and GUCY1B3,
encoding the α and β subunits of soluble guanylate cyclase; knockout
of either gene in murine models results in hypertension.

Another locus contains ADM—encoding adrenomedullin—which 
has natriuretic, vasodilatory and blood-pressure-lowering properties.
At the GNAS-EDN3 locus, ZNF831 is closest to the index SNP, but 
GNAS and EDN3 are two nearby compelling biological candidates 
(Supplementary Fig. 4 and Supplementary Table 15). 

We identified two loci with plausible connections to blood pressure 
via genes implicated in renal physiology or kidney disease. At the first 
locus, SLC4A7 is an electro-neutral sodium bicarbonate co-transporter 
expressed in the nephron and in vascular smooth muscle. At
the second locus, PLCE1 (phospholipase-C-epsilon-1 isoform) is
important for normal podocyte development in the glomerulus;
sequence variation in PLCE1 has been implicated in familial nephrotic
syndromes and end-stage kidney disease.

Missense variants in two genes involved in metal ion transport were 
associated with blood pressure in our study. The first encodes a His/ 
Asp change at amino acid 63 (H63D) in HFE and is a low-penetrance 
allele for hereditary hemochromatosis. The second is an Ala/Thr
polymorphism located in exon 7 of SLC39A8, which encodes a zinc
transporter that also transports cadmium and manganese. The same
allele of SLC39A8 associated with blood pressure in our study has
recently been associated with high-density lipoprotein cholesterol
levels and BMI (Supplementary Table 15).

We have shown that 29 independent genetic variants influence 
blood pressure in people of European ancestry. The variants reside 
in 28 loci, 16 of which were novel, and we confirmed association of 
several of them in individuals of non-European ancestry. A risk score 
Table 2 | Genetic risk score and cardiovascular outcome association results


Phenotype | Source | Effect (per s.d. of genetic risk score) | s.e. | Units | P value | No. SNPs | Top versus bottom quintiles | Top versus bottom deciles | Units | N case/control or total
Blood pressure phenotypes
SBP (mm Hg) | WGHS | 1.645 | 0.098 | (a) | 65x10°° | 29 | 4.61 | 5.77 | (a) | 23,294
DBP (mm Hg) | WGHS | 1.057 | 0.067 | (a) | 84x10-° | 29 | 2.96 | 3.71 | (a) | 23,294
Prevalent hypertension | WGHS | 0.211 | 0.018 | (b) | 3.1x10°%% | 29 | 1.80 | 2.09 | (b) | 5,018/18,276
Prevalent hypertension | BRIGHT | 0.287 | 0.031 | (b) | 7.7x10~73 | 29 | 2.23 | 2.74 | (b) | 2,406/1,990
Dichotomous endpoints
Incident heart failure | CHARGE-HF | 0.035 | 0.021 | (c) | 0.10 | 29 | 1.10 | 1.13 | (c) | 2,526/18,400
Incident stroke | NEURO-CHARGE | 0.103 | 0.028 | (c) | 0.0002 | 28 | 1.34 | 1.44 | (c) | 1,544/18,058
Prevalent stroke | SCG | 0.075 | 0.037 | (b) | 0.05 | 29 | 1.23 | 1.30 | (b) | 1,473/1,482
Stroke (combined, incident and prevalent) | CHARGE & SCG | NA | NA | NA | 3.3x10°5 | NA | NA | NA | NA | 3,017/19,540
Prevalent CAD | CARDIoGRAM | 0.092 | 0.010 | (b) | 16x10! | 28 | 1.29 | 1.38 | (b) | 22,233/64,726
Prevalent CAD | C4D ProCARDIS | 0.132 | 0.022 | (b) | 2.2x10-° | 29 | 1.45 | 1.59 | (b) | 5,720/4,381
Prevalent CAD | C4D HPS | 0.083 | 0.027 | (b) | 0.002 | 29 | 1.26 | 1.34 | (b) | 2,704/2,804
Prevalent CAD (combined) | CARDIoGRAM & C4D | 0.100 | 0.009 | (b) | 81x10? | 29 | 1.32 | 1.42 | (b) | 30,657/71,911
Prevalent chronic kidney disease | CKDGen | 0.014 | 0.0015 | (b) | 0.35 | 29 | 1.04 | 1.05 | (b) | 5,807/61,286
Prevalent microalbuminuria | CKDGen | 0.008 | 0.019 | (b) | 0.68 | 29 | 1.02 | 1.03 | (b) | 3,698/27,882
Continuous measures of target organ damage
Left ventricular mass (g) | EchoGen | 0.822 | 0.317 | (a) | 0.01 | 29 | 2.30 | 2.89 | (a) | 12,612
Left ventricular wall thickness (cm) | EchoGen | 0.009 | 0.002 | (a) | 60x10 ° | 29 | 0.03 | 0.03 | (a) | 12,612
Serum creatinine | KidneyGen | -0.001 | 0.001 | (d) | 0.24 | 29 | 1.00 | 1.00 | (d) | 23,812
eGFR (four-parameter MDRD equation) | CKDGen | -0.0001 | 0.0009 | (d) | 0.93 | 29 | 1.00 | 1.00 | (d) | 67,093
Urinary albumin/creatinine ratio | CKDGen | 0.005 | 0.007 | (d) | 0.43 | 29 | 1.01 | 1.02 | (d) | 31,580


Association of genetic risk score (using all 29 SNPs at 28 loci, parameterized using the average of SBP and DBP effects (= (SBP effect + DBP effect)/2) from the discovery analysis), tested in results from other
GWAS consortia. (a) Units are the unit of phenotypic measurement, either per standard deviation (s.d.) of genetic risk score, or as a difference between top/bottom quintiles or deciles. (b) Units are ln(odds) per s.d.
of genetic risk score, or odds ratio between top/bottom quintiles or deciles. (c) Units are ln(hazard) per s.d. of genetic risk score, or hazard ratio between top/bottom quintiles or deciles. (d) Units are ln(phenotype)
per s.d. of genetic risk score, or phenotypic ratio between top/bottom quintiles or deciles. s.e., standard error. SCG, UK-US Stroke Collaborative Group; see Supplementary Materials sections 10 and 11 for further
detail on consortia and studies.
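
Effects reported in ln(odds) units can be converted to a percentage change in odds by exponentiation. The sketch below applies this to the WGHS prevalent hypertension entry of Table 2 (effect 0.211, s.e. 0.018); the function name is hypothetical and the 95% interval assumes a normal approximation on the log scale.

```python
import math


def percent_change_in_odds(log_odds, se, z=1.96):
    """Return (point estimate, lower, upper) of the % change in odds per s.d."""
    point = (math.exp(log_odds) - 1) * 100
    lower = (math.exp(log_odds - z * se) - 1) * 100
    upper = (math.exp(log_odds + z * se) - 1) * 100
    return point, lower, upper


# WGHS prevalent hypertension: 0.211 ln(odds) per s.d. of risk score, s.e. 0.018.
point, lower, upper = percent_change_in_odds(0.211, 0.018)
```

The result, roughly 23% with an interval of about 19% to 28%, matches the figures quoted in the main text for this sample.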


derived from the 29 variants was significantly associated with blood- 
pressure-related organ damage and clinical cardiovascular disease, but 
not kidney disease. These loci improve our understanding of the gen- 
etic architecture of blood pressure, provide new biological insights into 
blood pressure control and may identify novel targets for the treatment 
of hypertension and the prevention of cardiovascular disease. 

Note added in proof: Since this manuscript was submitted, Kato et al. 
published a blood pressure GWAS in East Asians that identified a SNP 
highly correlated to the SNP we report at the NPR3/C5orf23 locus.


METHODS SUMMARY 


Supplementary Materials provide complete methods and include the following 
sections: study recruitment and phenotyping, adjustment for antihypertensive 
medications, genotyping, data quality control, genotype imputation, within- 
cohort association analyses, meta-analyses of discovery and validation stages, 
stratified analyses by sex and BMI, identification of eSNPs and non-synonymous 
SNPs, metabolomic and lipidomic analyses, CNV analyses, pathway analyses, 
analyses for non-European ancestries, association of a risk score with hypertension 
and cardiovascular disease, estimation of numbers of undiscovered variants, mea- 
surement of natriuretic peptides, and brief literature reviews and GWAS database 
lookups of all validated blood pressure loci. Full GWAS results for ~2.5 million 
SNPs are also provided. 
Dynamics of human adipose lipid turnover in health
and metabolic disease
Mats Eriksson, Erik Arner, Hans Hauner, Thomas Skurk, Mikael Ryden, Keith N. Frayn & Kirsty L. Spalding


Adipose tissue mass is determined by the storage and removal of 
triglycerides in adipocytes. Little is known, however, about adipose
lipid turnover in humans in health and pathology. To study this in
vivo, here we determined lipid age by measuring ¹⁴C derived from
above-ground nuclear bomb tests in adipocyte lipids. We report
that during the average ten-year lifespan of human adipocytes, 
triglycerides are renewed six times. Lipid age is independent of 
adipocyte size, is very stable across a wide range of adult ages and 
does not differ between genders. Adipocyte lipid turnover, however, 
is strongly related to conditions with disturbed lipid metabolism. In 
obesity, triglyceride removal rate (lipolysis followed by oxidation) is 
decreased and the amount of triglycerides stored each year is 
increased. In contrast, both lipid removal and storage rates are 
decreased in non-obese patients diagnosed with the most common 
hereditary form of dyslipidaemia, familial combined hyperlipidae- 
mia. Lipid removal rate is positively correlated with the capacity of 
adipocytes to break down triglycerides, as assessed through lipoly- 
sis, and is inversely related to insulin resistance. Our data support 
a mechanism in which adipocyte lipid storage and removal have 
different roles in health and pathology. High storage but low trigly- 
ceride removal promotes fat tissue accumulation and obesity. 
Reduction of both triglyceride storage and removal decreases lipid 
shunting through adipose tissue and thus promotes dyslipidaemia. 
We identify adipocyte lipid turnover as a novel target for prevention 
and treatment of metabolic disease. 

A major function of adipose tissue is to store and release fatty acids, 
which are incorporated into adipocyte triglycerides according to 
whole-body energy demands. Body fat mass is determined by the 
balance between triglyceride storage and removal in adipocytes, by 
either enzymatic hydrolysis (lipolysis) and subsequent fatty acid oxida- 
tion and/or ectopic deposition in non-adipose tissues. Little is known 
about the dynamics of these processes in humans. Although isotope 
tracer methods have been used to estimate lipid turnover in human 
adipose tissue, these studies have been limited to short-term experimental
conditions. To study long-term adipose tissue lipid turnover
in vivo and across the adult lifespan, we developed a method to retro- 
spectively determine the age of adipocyte triglycerides in humans. 
Triglycerides are the major component of the adipocyte lipid droplet. 
Lipid age was assessed by measuring the ¹⁴C content in the lipid
compartment of adipocytes from human subcutaneous adipose tissue, the
major fat depot in humans. ¹⁴C levels in the atmosphere remained
remarkably stable until above-ground nuclear bomb tests between
approximately 1955 and 1963 caused a significant increase in ¹⁴C
relative to stable carbon isotope levels (Fig. 1a). After the Limited
Nuclear Test Ban Treaty was signed in 1963, ¹⁴C levels in the
atmosphere decreased exponentially. This is not due to radioactive decay
(half-life (T½) for ¹⁴C is 5,730 years), but to diffusion of CO₂ out of
the atmosphere. ¹⁴C in the atmosphere oxidises to form CO₂, which is
taken up in the biotope by photosynthesis. Because we eat plants, or
animals that live off plants, the ¹⁴C content in the atmosphere is
directly mirrored in the human body.

Radiocarbon dating has been used to study the incorporation of
atmospheric ¹⁴C into DNA to determine the age of different human
cell types, including adipocytes. Here, we compared the incorporation
of ¹⁴C into adipocyte triglycerides with the dynamic changes in
atmospheric ¹⁴C described earlier. Triglyceride age was determined by
using a linear lipid replacement model in which the age distribution of
lipids within an individual was exponentially distributed, corresponding
to a constant turnover rate (per year). The associated mean age,
termed lipid age, is the inverse of the turnover rate and reflects the
irreversible removal of lipids from adipose stores (Supplementary
Information 1 and Fig. 1 of Supplementary Information 1).
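
The replacement model described above can be sketched numerically. The Python below is a minimal illustration under stated assumptions, not the calculation given in Supplementary Information 1: `atmospheric_c14` is a made-up stand-in for the measured bomb curve (its peak height and decay constant are assumed), and all function names are hypothetical. It predicts the ¹⁴C level of a lipid pool whose ages are exponentially distributed with turnover rate K, then recovers K, and hence lipid age 1/K, from a measured value by bisection.

```python
import math

def atmospheric_c14(year):
    """Toy bomb curve in fraction modern (assumed shape, not measured data)."""
    if year < 1955.0:
        return 1.0
    if year < 1963.0:
        # assumed linear rise to a peak of 1.8 during the test period
        return 1.0 + 0.8 * (year - 1955.0) / 8.0
    # assumed exponential decline after 1963 (time constant 16 years)
    return 1.0 + 0.8 * math.exp(-(year - 1963.0) / 16.0)

def predicted_lipid_c14(collection_year, turnover_rate, max_age=100.0, step=0.02):
    """Average the atmospheric curve over an exponential lipid age distribution."""
    total = 0.0
    for i in range(int(max_age / step)):
        age = (i + 0.5) * step  # midpoint rule
        weight = turnover_rate * math.exp(-turnover_rate * age) * step
        total += weight * atmospheric_c14(collection_year - age)
    return total

def fit_turnover_rate(measured_c14, collection_year, lo=0.2, hi=5.0):
    """Bisection for K; assumes a post-peak biopsy, where older lipid holds more 14C."""
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if predicted_lipid_c14(collection_year, mid) > measured_c14:
            lo = mid  # predicted lipid is too old, so turnover must be faster
        else:
            hi = mid
    return 0.5 * (lo + hi)

rate = 1.0 / 1.6  # turnover rate corresponding to a lipid age of 1.6 years
value = predicted_lipid_c14(2010.0, rate)
lipid_age = 1.0 / fit_turnover_rate(value, 2010.0)
```

The round trip at the end only checks internal consistency of the sketch; with real data the atmospheric record would replace the toy curve.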

Earlier studies indicate that triglycerides in adipose tissue form two 
distinct pools with high or low turnover rates, respectively. Our
data, obtained from individuals born before, during and after bomb
testing, do not support the hypothesis of dual large lipid pools with
different half-lives (Fig. 1b). ¹⁴C data were modelled according to one
or more pools of lipids with different lipid removal rates (Supplemen- 
tary Information 1). The existence of a very small pool of younger 
lipids cannot be excluded based on data modelling (Supplementary 
Information 1 and Fig. 2 of Supplementary Information 1). According 
to a two-pool model the influence on the turnover rate is proportional 
to the fraction of lipid in the small pool. Triglyceride exchange between 
adipocytes and other small storage pools can affect turnover estimates. 
The two-pools model shows, however, that the non-adipose pool can 
be neglected when it makes up less than 20% of the lipids (Supplemen- 
tary Information 1, Fig. 3). Small pools with high turnover are more 
important for short-term (days or weeks) than long-term (years) 
triglyceride turnover. 
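
A crude way to see why a small, fast pool has little influence is to treat the lipid store as a mixture of two exponential pools and compute its mean age. This is a sketch under assumptions, not the two-pool model of Supplementary Information 1: the fast-pool rate is an invented value and `mixture_mean_age` is a hypothetical helper. The mean age shifts linearly with the fraction of lipid in the small pool.

```python
def mixture_mean_age(small_fraction, fast_rate, slow_rate):
    """Mean lipid age of two exponential pools, weighted by lipid fraction."""
    return small_fraction / fast_rate + (1.0 - small_fraction) / slow_rate

slow_rate = 1.0 / 1.6   # per year, from the mean lipid age reported in the text
fast_rate = 12.0        # per year, an assumed value for a rapidly cycling pool

for fraction in (0.0, 0.05, 0.10, 0.20):
    age = mixture_mean_age(fraction, fast_rate, slow_rate)
    print(f"small pool {fraction:.0%}: mean age {age:.2f} years")
```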

Mean lipid age was 1.6 years (Fig. 1c), which is in the same range as 
in short-term turnover studies*. The distribution of lipid age was 
compared with that of adipocyte age reported previously in a com- 
parable cohort’. The mean age of adipocytes was 9.5 years (Fig. 1d). 
This implies that triglycerides, on average, are replaced six times 
during the lifespan of the adipocyte, enabling a dynamic regulation 
of lipid storage and mobilization over time. 
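
The arithmetic behind the "six times" figure follows directly from the two means quoted above; the variable names are only illustrative.

```python
mean_lipid_age_years = 1.6       # Fig. 1c
mean_adipocyte_age_years = 9.5   # Fig. 1d

# Lipid age is the inverse of the turnover rate, so K = 1 / 1.6 per year.
turnover_rate_per_year = 1.0 / mean_lipid_age_years
renewals_per_lifespan = mean_adipocyte_age_years * turnover_rate_per_year
print(round(renewals_per_lifespan, 1))  # 5.9, i.e. about six renewals
```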

There is a large variation in adipocyte size within and between indi- 
viduals (Supplementary Information 2, Supplementary Table 1)”. 
However, it is unlikely that the rate of triglyceride removal from 
adipocytes is important for these variations, as lipid age was not related 
to adipocyte size when set in relation to the body fat mass (Fig. 2a, b), 
nor was there a difference in lipid age between large and small adipo- 
cytes of the same adipose tissue sample (Fig. 2c, d). These data indicate 
that there is a continuous exchange of lipids between adipocytes within 
the adipose tissue that is not dependent on adipocyte size. Fatty acids 
produced by lipolysis in one adipocyte could, for example, be taken up 
by adjacent adipocytes and incorporated into their triglycerides. These
processes would not be part of lipid removal as measured here.

Figure 1 | Atmospheric ¹⁴C over time and its use to determine lipid age and
adipocyte age. a, Above-ground nuclear bomb testing during the period of the
cold war caused an increase in atmospheric levels of ¹⁴C. These values decreased
exponentially following implementation of a limited world-wide test ban treaty
in 1963 (blue curve). Lipid age is determined by measuring ¹⁴C levels in lipids
(1) and plotting this value against the bomb curve (2) to determine the
difference between the year corresponding to the atmospheric ¹⁴C
concentration (3) and the biopsy collection date (dashed line). Atmospheric ¹⁴C
levels are presented as ¹⁴C/¹²C ratios in units of fraction modern (for a
definition of ‘modern’ see Supplementary Information 2). b, Lipid age and
turnover do not change as a function of person age. Lipid age is shown for three
individuals born in 1940.2, 1959.9 and 1967.9. Lipid age was shown to be the
same for all individuals, despite markedly different subject ages. Fat biopsies
were collected from all individuals on the same date (dashed vertical line). The
solid vertical lines indicate the date of birth. The small dashed lines show the
¹⁴C lipid value for each individual. c, Distribution of values for lipid age in
healthy non-obese or obese individuals from cohort 1 (n = 78). d, The
distribution of values for human adipocyte age (n = 27). Adipocyte age data are
obtained from a previous publication (see main text).

Figure 2 | Relationship between adipocyte size and lipid age. a, b, Influence
of adipocyte cellularity on lipid age. Individuals were assigned a morphometric
value, which is the difference between the measured adipocyte volume for the
individual and the average adipocyte volume for all subjects (see
Supplementary Information 2). This analysis was carried out across the full
range of body masses. Positive values indicate larger adipocytes than expected
(fewer but larger adipocytes = hypertrophy). Negative values indicate smaller
adipocytes than expected (many but smaller adipocytes = hyperplasia).
a, Individual values compared by linear regression analysis (n = 74). b, Data
(mean ± standard error) with morphology as a dichotomous variable (n = 36
for hyperplasia and n = 38 for hypertrophy). An unpaired t-test was used.
c, d, Isolated subcutaneous adipocytes were fractionated into small (fraction 1) or
very large (fraction 4) samples (n = 7). Adipocyte volume (c) and adipose lipid
age (d) were compared. Values are mean ± standard error. A paired t-test was
used. Data in a and b are from non-obese plus obese individuals in cohort 1 and
data in c and d are from cohort 2.

Lipid age and total fat mass data were used to determine the net 
triglyceride storage in adipose tissue (kg year⁻¹) (see Supplementary
Information 1). The net amount of lipid stored in adipose tissue each 
year is the sum of exogenous fat incorporation and endogenous syn- 
thesis, minus lipid removal. The removal rate represents the hydrolysis 
of triglycerides (lipolysis) followed by the irreversible removal of lipids 
by oxidation. A high lipid age therefore mirrors low removal rates. No 
relationship between lipid storage or removal and person age or gender 
was seen (Supplementary Information 2 and Fig. 1a-d of Supplementary
Information 2).
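
One simple reading of how lipid age and fat mass combine into a storage rate is sketched below. It assumes a single lipid compartment at steady state, where the yearly flux through the pool equals fat mass divided by mean lipid age; the actual calculation is in Supplementary Information 1 and may differ. The function name and the 20 kg example are hypothetical.

```python
def net_storage_kg_per_year(fat_mass_kg, lipid_age_years):
    """Steady-state lipid flux through a single pool: mass divided by mean age."""
    if lipid_age_years <= 0:
        raise ValueError("lipid age must be positive")
    return fat_mass_kg / lipid_age_years

# Hypothetical subject: 20 kg of body fat with the cohort mean lipid age.
example = net_storage_kg_per_year(fat_mass_kg=20.0, lipid_age_years=1.6)
```

Under this assumption a higher lipid age at the same fat mass means less lipid passing through the tissue each year, which matches the statement that a high lipid age mirrors low removal rates.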

Two clinical conditions where altered lipid metabolism is observed 
were investigated—obesity and familial combined hyperlipidaemia 
(FCHL); the latter is the most common hereditary lipid disorder 
(reviewed in ref. 16). It has an unknown aetiology and is a common 
hereditary cause of premature coronary heart disease. Adipocyte lipo- 
lysis is impaired in both conditions due to decreased cyclic AMP- 
dependent signalling, the major lipolytic pathway in adipocytes’. 
Both conditions show a similar metabolic phenotype (mixed dyslipi- 
daemia, elevated apolipoprotein B and insulin resistance)*’. These 
clinical characteristics are confirmed in our study cohort (Supplemen- 
tary Information 2, Supplementary Table 1). FCHL individuals may 
present with a range of body fat levels; however, for our analyses only 
non-obese FCHL patients were selected so as to remove the confound- 
ing factor of obesity from the study. 

In obese subjects, the rate of triglyceride storage (Fig. 3a) and mean 
lipid age (Fig. 3b) were markedly increased compared to non-obese 
individuals. Both lipid age (r = 0.38, P = 0.0005) and triglyceride
storage (r = 0.60, P < 0.0001) correlated with body mass index (BMI)
when non-obese and obese individuals were pooled together. Similarly,
in non-obese FCHL individuals lipid age was increased to values 
observed in obesity (Fig. 3b). In contrast to obesity, however, the rate of
triglyceride storage was markedly decreased compared to non-obese
individuals (Fig. 3a). Thus, adipocyte triglyceride turnover is not just a
mere reflection of the fat mass. Our data indicate a model where a
combination of high storage and low lipid removal rates, as in obesity,
facilitates triglyceride accumulation within adipose tissue, thereby
promoting the development and/or maintenance of excess body fat mass.

Figure 3 | Lipid turnover in subcutaneous fat. a, b, Lipid storage (a) and lipid
age (b) were determined in 48 non-obese, 30 obese and 13 non-obese FCHL
subjects. Error bars indicate standard error. Overall effect is P < 0.0001 by
analysis of variance (ANOVA) in a and b. Results in graphs are from post-hoc
test. Data are from cohort 1 (see Supplementary Information 2). A linear
regression analysis was performed on all individuals from cohort 1 having
insulin resistance measures (n = 82). c, d, HOMA-IR was correlated with lipid
age (c) and lipid storage (d). The relationship between lipid age and HOMA-IR
remained significant when BMI, gender or group (non-obese, obese, FCHL)
were included in the analysis (partial r = 0.41, P = 0.006 with BMI using
multiple regression analysis and F = 16.6, P = 0.0001 and F = 4.8, P = 0.03 for
gender or group, respectively, using analysis of covariance (ANCOVA)).

Conversely, a low rate of both triglyceride storage and removal, as in 
FCHL, leads to reduced triglyceride turnover and thereby a decreased 
ability of adipocytes to store and release fatty acids, despite a normal 
body fat mass. As discussed in detail elsewhere*'”’, low lipid turnover in 
adipose tissue may result in fatty acids being shunted to the liver, which 
drives the synthesis of apolipoprotein B and increases the circulating 
levels of triglycerides. Adipocyte triglyceride turnover may also be 
involved in determining overall insulin effects. Insulin resistance
(indirectly measured by the HOMA-IR index, see Supplementary 
Information 2) and lipid turnover were assessed in 82 individuals. 
Triglyceride age was strongly related to levels of insulin resistance 
(Fig. 3c), although there was no relationship between triglyceride storage 
and insulin resistance (Fig. 3d). There was no significant interaction 
between groups (lean, obese and non-obese FCHL) as determined by 
analysis of co-variance, indicating that the rate of triglyceride removal 
from adipocytes has an impact on whole-body insulin sensitivity inde- 
pendent of any underlying disorder. Multiple regression analysis 
showed that the relationship between HOMA-IR and lipid removal 
was not influenced by plasma triglycerides (partial r = 0.35; P = 0.007). 
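
The partial r values quoted in this section are correlations between two variables after the linear effect of a third has been removed. A minimal sketch of the standard first-order formula follows; it is offered as an illustration of the statistic, not as the authors' analysis pipeline, and the four-point data set is invented purely to exercise the functions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(xs, ys, zs):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Invented values: x and y stand in for lipid age and HOMA-IR, z for a covariate.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]
z = [1.0, -1.0, -1.0, 1.0]   # constructed to be uncorrelated with x and y
print(partial_r(x, y, z))
```

When the covariate is uncorrelated with both variables, as in this toy case, the partial correlation equals the ordinary correlation.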

We also examined non-obese and obese individuals separately 
(Supplementary Information 2 and Figure 2a-d of Supplementary 
Information 2). Variations in BMI were significantly related to lipid 
age only among non-obese and to lipid storage only among obese 
individuals. HOMA-IR variations were significantly related to lipid 
storage when obese subjects were removed from the analysis (no rela- 
tionship was found among obese subjects themselves). Thus, varia- 
tions in triglyceride turnover may have a different impact on metabolic 
status in obese versus non-obese populations. Clearly, this assumption 
must be confirmed by investigations in much larger samples. 

Because adipose tissue lipolysis is the first step in lipid removal, we 
investigated the ability of the cyclic AMP system to activate lipolysis in 
vitro in adipocytes isolated from lean and obese individuals and com- 
pared this with in vivo measurements of lipid storage and removal. 
Spontaneous (basal) lipolysis was not related to lipid turnover (Fig. 4a, 
b). However, the stimulated rate of lipolysis was positively correlated 
with triglyceride removal (inversely correlated with lipid age) but was 
not related to the rate of triglyceride uptake (lipid storage). This was 
irrespective of whether lipolysis was induced using a cyclic AMP ana- 
logue (Fig. 4c, d), by activating endogenous adenylate cyclase (using 
forskolin; Supplementary Information 2 and Fig. 3a, b of Supplementary
Information 2) or by administration of a synthetic β-adrenoceptor-selective
catecholamine (isoprenaline; Fig. 3c, d of Supplementary
Information 2). These data indicate that lipolysis determines lipid 
turnover in adipocytes by regulating the rate of triglyceride removal. 
The impact of subsequent fatty acid oxidation could not be examined 
in this study; however, decreased lipid oxidation is frequently observed
in obesity. As there are regional variations in lipolysis and all our
studies were performed on one fat depot, no attempts were made to
extrapolate findings to the whole-body level.

We are in the midst of a global epidemic of obesity with negative health
and socio-economic consequences. We propose adipose triglyceride
turnover as a novel target for the prevention and treatment of excess
body fat and possibly its consequences for insulin resistance. New
insights into abnormal triglyceride turnover in FCHL patients may
also suggest novel treatment strategies for this complex disease that
target adipocytes.


METHODS SUMMARY 

Subjects. Subcutaneous adipose tissue was obtained from two patient cohorts. 
Patient selection and collection of clinical data are described in Supplementary 
Information 2. 

Preparation of lipids. Triglycerides were extracted from pieces of adipose tissue 
or isolated adipocytes. Details of lipid extraction and adipocyte isolation are given 
in Supplementary Information 2. Extracted lipids were subjected to accelerator 
mass spectrometry analysis, as described in Supplementary Information 2. 

Data analysis. Calculations between lipid turnover and clinical or adipocyte phe- 
notypes are described in Supplementary Information 2. Calculations of lipid age and 
net lipid uptake by adipose tissue are described in Supplementary Information 1. 
Conventional statistical methods were used to summarize and compare data. 
Endonuclease G is a novel determinant of cardiac hypertrophy and mitochondrial function
Left ventricular mass (LVM) is a highly heritable trait and an
independent risk factor for all-cause mortality’. So far, genome- 
wide association studies have not identified the genetic factors that 
underlie LVM variation’, and the regulatory mechanisms for 
blood-pressure-independent cardiac hypertrophy remain poorly 
understood*°. Unbiased systems genetics approaches in the rat®’ 
now provide a powerful complementary tool to genome-wide asso- 
ciation studies, and we applied integrative genomics to dissect a 
highly replicated, blood-pressure-independent LVM locus on rat 
chromosome 3p. Here we identified endonuclease G (Endog), 
which previously was implicated in apoptosis® but not hyper- 
trophy, as the gene at the locus, and we found a loss-of-function 
mutation in Endog that is associated with increased LVM and 
impaired cardiac function. Inhibition of Endog in cultured 
cardiomyocytes resulted in an increase in cell size and hypertrophic 
biomarkers in the absence of pro-hypertrophic stimulation. 
Genome-wide network analysis unexpectedly implicated ENDOG 
in fundamental mitochondrial processes that are unrelated to 
apoptosis. We showed direct regulation of ENDOG by ERR-α
and PGC1α (which are master regulators of mitochondrial and
cardiac function), interaction of ENDOG with the mitochondrial
genome and ENDOG-mediated regulation of mitochondrial
mass. At baseline, the Endog-deleted mouse heart had depleted 
mitochondria, mitochondrial dysfunction and elevated levels of 
reactive oxygen species, which were associated with enlarged and 
steatotic cardiomyocytes. Our study has further established the 
link between mitochondrial dysfunction, reactive oxygen species 
and heart disease and has uncovered a role for Endog in maladap- 
tive cardiac hypertrophy. 

Increased LVM is a clinically important trait that independently 
predicts the risk of heart failure, sudden death and all-cause mortality’. 
Although LVM is a heritable complex trait’, large genome-wide asso- 
ciation studies have not identified LVM-associated genes’. Blood- 
pressure-dependent regulation of LVM, which is perhaps surprisingly 
limited’, has been studied extensively in model systems and acts through 
well-characterized and overlapping signalling modules'*. By con- 
trast, the pathways that underlie blood-pressure-independent cardiac 


hypertrophy, which is commonly seen in obesity and type 2 diabetes and 
is associated with mitochondrial dysfunction and lipotoxicity**, remain 
largely unknown. Here we took advantage of the recent step changes in 
integrative systems genetics approaches in the rat*” to dissect a blood- 
pressure-independent cardiac mass quantitative trait locus (QTL) and 
to identify the causative gene and underlying mechanism. 

The rat is unique for the study of cardiac mass, with more than 75
QTLs identified for this trait (Rat Genome Database;
http://rgd.mcw.edu/). Rat chromosome 3p (0-25 megabase pairs (Mbp)) contains
a highly replicated and blood-pressure-independent QTL for cardiac
mass, which has been mapped in crosses of the spontaneously hypertensive
rat (SHR) or the SHR stroke prone (SHRSP) rat to Wistar Kyoto
(WKY) or salt sensitive (SS) rats. To dissect this locus genetically, we
generated an F₂ intercross of SHR and Brown Norway (BN) strains and
further replicated the LVM QTL (logarithm of odds (LOD) = 4.2)
(Fig. 1a). We confirmed the blood-pressure-independent QTL effect
in a congenic strain (SHR.BN-(3L)) that had a lower LVM and smaller
cardiomyocytes than the SHR strain (Fig. 1b, c), and we refined the QTL
region (to chromosome 3, 6.4-11.2 Mbp) using a second congenic
strain (SHR.BN-(3S)) (Supplementary Fig. 1). In the F₂ cross, in the
SHR.BN-(3L) strain and in previous experimental crosses, the SHR
allele at the locus was associated with increased cardiac mass, and this 
effect was blood pressure independent (Fig. 1a, b, d). Functional assess- 
ment in vivo revealed that the SHR.BN-(3L) strain had better cardiac 
performance at baseline and after stimulation, compared with the SHR 
strain (Supplementary Fig. 1). These data show that an SHR allele at the 
cardiac mass QTL on rat chromosome 3p increases LVM and adversely 
affects cardiac function. 
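
For readers less familiar with LOD scores: a LOD is the base-10 logarithm of a likelihood ratio, here the likelihood of the data with a QTL at the locus against the likelihood with none. The short sketch below converts the reported LOD of 4.2 back into that ratio; the function name is hypothetical.

```python
def lod_to_likelihood_ratio(lod):
    """Invert LOD = log10(likelihood with QTL / likelihood without QTL)."""
    return 10.0 ** lod

ratio = lod_to_likelihood_ratio(4.2)
print(f"LOD 4.2 corresponds to a likelihood ratio of about {ratio:,.0f} to 1")
```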

We used the new genotypes generated in our F₂ cross and those
from previous experiments to refine the QTL region, and we identified
five distinct loci (spanning 750 kilobase pairs in total) that co-segregated
with the haplotypes associated with LVM variation
(Fig. 1e). Endog, which we had previously shown to be cis regulated
in the heart (P= 3 X 10 °)’, was the only gene at these loci that was 
differentially regulated in a consistent direction in the SHR and SHRSP 
hearts compared with the WKY heart (Supplementary Table 1). 
ENDOG is a nuclear-encoded, mitochondria-localized nuclease with 
a proposed but disputed function in apoptosis and no known
effect on cardiac mass or function. We observed reduced expression
of Endog transcripts and lack of ENDOG protein in all strains that had
increased cardiac mass (Fig. 1f, g). Sequencing of Endog revealed promoter
and coding sequence variation, and we identified an SHR-specific,
frame-shift-causing insertion in exon 1 of Endog that was
associated with increased heart weight and LVM (Supplementary
Fig. 2). There was a marked reduction in cardiac nuclease activity,
which was ENDOG dependent, in the SHR heart compared with
the BN heart (Fig. 1h, i). In recombinant inbred strains derived from
the SHR and BN strains, we confirmed the direct relationship
between the insertion in SHR and the lack of nuclease activity
(Fig. 1j), and we mapped Endog-dependent nuclease activity to a single
locus that encodes Endog (Fig. 1k). These data identify Endog as the
candidate gene at the QTL and implicate Endog loss of function as the
mechanism for increased cardiac mass and impaired heart function.

Figure 1 | Positional cloning of Endog as the gene underlying the rat
chromosome 3p cardiac mass QTL. a, Mapping of heart weight (HW) and
LVM corrected for body weight (BW) to chromosome 3p in the BN × SHR F₂
population. The telomeric limits of the congenic strains (SHR.BN-(3L) and
SHR.BN-(3S)) and the previously mapped cardiac mass (Cm) QTLs are
shown. The x axis indicates the physical position in Mbp, and the dashed lines
show the limits of the refined QTL. BP, blood pressure. b, HW indexed to BW
in the SHR (n = 4) and the SHR.BN-(3L) congenic strains (n = 5). c, Relative
cardiomyocyte cross-sectional area in SHR and SHR.BN-(3L) congenic strains.
d, In vivo telemetric systolic blood pressure (SBP) and diastolic blood pressure
(DBP) measurements in the SHR (red) and SHR.BN-(3L) (black) strains (n = 8
per genotype). d, days. e, Haplotype analysis of the refined QTL region. Single
nucleotide polymorphisms (SNPs) are depicted with reference to WKY alleles
(grey, identical; and white, dissimilar) with numbers (1-5) denoting the regions
that are polymorphic between strains with high LVM and low LVM. BN-Lx,
Brown Norway-Lx; N, unidentified nucleotide. f, g, Quantitative PCR of Endog
messenger RNA expression (f) and immunoblotting analysis of ENDOG
protein expression (g) in strains with low or high cardiac mass at the
chromosome 3p locus. GAPDH is a loading control. h, Nuclease activity in BN
and SHR heart extracts over a range of cardiac protein extract amounts (grey
wedge) (see Supplementary Methods). The first lane shows linearized plasmid.
i, Reversal of nuclease activity in BN-Lx cardiac lysates by addition of a
Drosophila-derived inhibitor of ENDOG, CG4930 (range 1,500-1.5 nM, grey
wedge). The first lane shows linearized plasmid, and the second lane shows
cardiac lysate in the absence of inhibitor. j, Association of the Endog indel
(insertion and/or deletion) with loss of ENDOG protein expression and
diminished nuclease activity in the recombinant inbred strains. Top, centre and
bottom panels show the DNA indel, protein expression and nuclease activity,
respectively. k, Linkage mapping of nuclease activity in the recombinant inbred
strains using a quantitative fluorescence-based assay (see Supplementary
Methods). Chr, chromosome. All data are represented as mean ± s.e.m. *,
P < 0.05; **, P < 0.01; ***, P < 0.001.

We performed immunoblotting across rat and mouse tissues and 
determined that ENDOG was most highly expressed in the heart, 
where it was localized to cardiomyocytes (Fig. 2a, b) and co-localized 
with mitochondria (Supplementary Fig. 3). Using a short hairpin RNA 
(shRNA) knockdown of Endog (shEndog)’’, we tested the effect of 
Endog loss of function in cardiomyocytes and observed an increase 
in hypertrophic biomarkers and cell size in the absence of pro-hyper- 
trophic stimulation (Fig. 2c, d). Conventional blood-pressure-dependent 
hypertrophic signalling pathways’? were not activated in shEndog- 
treated cells, but AMP-activated protein kinase (AMPK) was activated 
(Supplementary Fig. 4), which can induce cardiac hypertrophy”. We 
also observed increased amounts of reactive oxygen species (ROS), 
which are also pro-hypertrophic stimuli*’ that act through multiple 
downstream effectors (Supplementary Fig. 4). These data show that 
Endog loss of function directly induces cardiomyocyte hypertrophy in 
vitro and that this hypertrophy is associated with the activation of two 
pro-hypertrophic pathways, both of which have previously been linked
to mitochondrial dysfunction.

Figure 2 | Endog regulates cardiac hypertrophy. a, Immunoblotting analysis
of ENDOG expression in mouse and rat tissues (ENDOG, ~30 kDa).
b, Immunoblotting analysis of ENDOG expression in cardiomyocyte and
non-cardiomyocyte populations isolated from neonatal rat hearts. c, The size of
cardiomyocytes (n = 100 cells, n = 3 independent experiments) treated with
shRNA against Endog (shEndog) or control shRNA (shControl) in the presence
or absence of the hypertrophic stimulant phenylephrine (PE, 100 µM, 24 h).
d, Expression of the hypertrophic biomarker Anf (which encodes atrial
natriuretic factor) in shEndog- or shControl-treated cardiomyocytes.
e, Cardiomyocyte size (see also Supplementary Fig. 5) in Endog−/− and
wild-type (WT) mice at baseline and after angiotensin II (AngII)-induced cardiac
hypertrophy. f, LVM to tibial length ratio in Endog−/− and WT mice at baseline
and after stimulation with AngII. c-f, Data are presented as mean ± s.e.m. *,
P < 0.05; ***, P < 0.001.

We then examined the effects of Endog loss of function in vivo in
the Endog-deleted (Endog−/−) mouse, which shows no detectable
difference in apoptotic phenotypes compared with wild-type mice,
an observation that was confirmed in an independent Endog-deleted
strain. Compared with controls, Endog−/− mice had larger
cardiomyocytes at baseline (Fig. 2e) in the absence of stimulation, in keeping
with our observations in the SHR.BN-(3L) rat (Fig. 1c) and in vitro
(Fig. 2c). Following angiotensin-II-mediated stimulation of hypertrophy,
which is largely ROS dependent, we observed an increase
in cardiomyocyte size, hypertrophic biomarker expression and LVM
in Endog−/− mice (Fig. 2e, f and Supplementary Fig. 5). Endog−/−
mice had blood pressures that were equivalent to those of control mice
at baseline (P = 0.49) and after stimulation with angiotensin II
(P = 0.51) (data not shown). Together, our in vitro and in vivo data
confirm a role for Endog in cardiomyocyte hypertrophy and identify
ROS as conserved pro-hypertrophic stimuli in both systems.

Endog has been proposed to be important for apoptotic cell death’; 
however, this was subsequently disputed’®”’, and it was unclear how 
Endog loss of function was associated with cardiac hypertrophy and 
dysfunction. To infer the function of ENDOG in the human heart, we 
carried out genome-wide co-expression network analysis” in a large 
human cardiac expression data set (n = 210) (see Supplementary 
Methods). ENDOG was identified in a network that was highly 
Figure 3 | ENDOG is co-expressed with a mitochondria-specific gene
network and is regulated by PGC1α and ERR-α. a, The genes (8,490 from
210 data sets) were clustered and plotted based on the dissimilarity metric 
between their expression profiles (see Supplementary Methods). Low-hanging 
branches in the dendrogram (top) represent groups of genes (modules) that 
have a high similarity metric. Modules are shown beneath the dendrogram 
(centre) and are colour coded. The arrow indicates the module (also boxed) that 
contains ENDOG. In the heat map of the correlations between expression 
profiles (bottom), high and low similarities are coloured yellow and red, 
respectively. b, Weighted gene co-expression network analysis (WGCNA)” for 
the module that contains ENDOG, providing functional annotation through 
cellular localization by Gene Ontology (GO) classification (Supplementary 
Tables 2 and 3). Nodes represent genes, and edges represent significant co- 
expression between genes. The node size is proportional to the relative degree of 
interconnectivity of each gene within the module. c, Quantitative PCR analysis 
of Endog expression in cultured cardiomyocytes after infection with adenovirus
(ad) expressing green fluorescent protein (GFP) (ad.GFP) or increasing
amounts of adenovirus expressing Pgc1a (ad.Pgc1a) (wedge).
d, Immunoblotting analysis of Endog expression in ad.Pgc1a-infected
cardiomyocytes (top), skeletal muscle of WT mice and transgenic mice
expressing Pgc1a under the control of muscle creatine kinase (MCK-Pgc1a)
(centre), and hearts of WT and cardiac-specific Pgc1a-deleted mice
(Pgcla*“/°) (bottom). e, Endog promoter activity, measured using a luciferase 
reporter, in HEK293 cells infected with ad.Pgc1a and/or ad.Erra. f, ERR-α
ChIP-PCR of two regions of the ENDOG promoter. Red arrows denote primers, 
and ERRE specifies the location of a consensus ERR response element (1,304 
bases upstream of the transcription start site). The experiment was repeated 
three times with similar results, and PCR products were quantified by 
quantitative PCR. 

enriched for mitochondrial genes (P = 2 10 °°) and oxidative meta- 
bolism processes (P = 5 X 10°**) (Fig. 3 and Supplementary Tables 2 
and 3). Taken together, the high levels of Endog expression in meta- 
bolically active organs (Fig. 2a) and in brown fat (Supplementary 
Fig. 6), the unique co-expression of ENDOG with oxidative metabol- 
ism genes, and the link to AMPK signalling and ROS production 
pointed to an unappreciated effect of Endog in physiological mitochondrial
processes.
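
The co-expression analysis referred to above groups genes by how similar their expression profiles are across samples. The sketch below shows one common simplified form of that idea: a soft-thresholded correlation used as an adjacency, with one minus the adjacency as the dissimilarity. It is an illustration only. The published WGCNA method additionally uses topological overlap, the power `beta=6` is an assumed value, and the gene names and profiles are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def coexpression_dissimilarity(profiles, beta=6):
    """Map each gene pair to 1 - |r|**beta, where r is the profile correlation."""
    genes = sorted(profiles)
    dissimilarity = {}
    for g in genes:
        for h in genes:
            adjacency = abs(pearson_r(profiles[g], profiles[h])) ** beta
            dissimilarity[(g, h)] = 1.0 - adjacency
    return dissimilarity

# Invented expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # perfectly co-expressed with geneA
    "geneC": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with geneA
}
diss = coexpression_dissimilarity(profiles)
```

Genes with small pairwise dissimilarity would fall on the same low-hanging branch of a dendrogram such as the one in Fig. 3a.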

Peroxisome proliferator activated receptor-γ co-activator 1α
(PGC1α) is widely recognized as a master regulator of mitochondrial
function and activates many target genes that are components of the
ENDOG-associated network (Fig. 3b) through interaction with
oestrogen-related receptor-α (ERR-α). Therefore, we tested whether PGC1α also
regulates Endog, and we observed robust PGC1α-induced Endog transcript
and ENDOG protein expression in cardiomyocytes in vitro
(Fig. 3c, d). We confirmed the effects of varying Pgc1a expression on
ENDOG protein expression in vivo using mice that overexpressed
Pgc1a under the control of muscle creatine kinase (MCK-Pgc1a) and
in mice in which Pgc1a had been deleted specifically in cardiomyocytes
(Pgcl ahc/Acy (Fig. 3d and Supplementary Methods). Luciferase studies 
revealed strong activation of the Endog promoter by PGC1a and ERR-« 
together (Fig. 3e), and we confirmed direct binding of ERR-« to the 
ENDOG promoter by chromatin immunoprecipitation and PCR 
(ChIP-PCR) in a region containing an ERRA response element 
(P< 0.001) (Fig. 3f). These data show that Endog is a direct target of 
ERR-o and PGC1a, master regulators of mitochondrial and heart func- 
tion, further implicating Endog in mitochondrial and cardiac biology. 

It was apparent that the effects of Endog loss of function on cardiac 
hypertrophy might be mediated through perturbations of mitochon- 
drial physiology, which we therefore examined. Electron microscopy 
revealed lipid-like droplets associated with the mitochondria of 
Endog−/− mice, and these droplets were more numerous and larger
than those seen in control mice (Fig. 4a, b). Molecular studies revealed
a marked elevation of triglyceride levels in the hearts of Endog−/− mice
(Fig. 4c) that manifested as cardiomyocyte steatosis (Fig. 4a and Sup- 
plementary Fig. 7) but was not associated with variation in the expres- 
sion levels of fatty acid metabolism or mitochondrial biogenesis genes 
(Supplementary Figs 8 and 9). Compared with their wild-type 
littermates, Endog−/− mice had impaired mitochondrial respiration
and increased ROS production (Fig. 4f, g). 

To assess for mitochondrial depletion, we examined the ratio of
mitochondrial DNA (mtDNA) to genomic DNA and the ratio of mitochondrial
protein to tissue weight, both of which were diminished in the
hearts of Endog−/− mice (Fig. 4d, e) in the absence of mtDNA structural
variation (Supplementary Fig. 10). This was an intriguing finding
given the previously proposed roles for Endog in mtDNA synthesis,
processing of polycistronic mtRNA and mitochondrial biogenesis,
which had subsequently been discarded based primarily on experiments
in Endog-deleted mice. We re-examined a role for ENDOG
in mitochondrial biogenesis and demonstrated an increase in
mitochondrial mass with chronic ENDOG expression in HEK293 cells
(P < 0.01) and with acute Endog overexpression in a cardiomyocyte-derived
cell line (P < 0.001) (Fig. 4h–k) in the absence of an effect on
apoptotic or necrotic cell death (Supplementary Fig. 11). A role for
ENDOG in mtDNA biology was supported further by ChIP-PCR
experiments that showed direct binding of ENDOG throughout the
mtDNA molecule (Fig. 4l), as previously demonstrated for mitochondrial
transcription factor A (TFAM), which is a crucial determinant
of mtDNA synthesis and repair that, when deleted, causes eccentric
cardiac hypertrophy and heart failure.
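
The mtDNA-to-genomic-DNA ratio mentioned above is commonly derived from quantitative PCR threshold cycles. The sketch below shows one way to do this, assuming perfect amplification efficiency (a doubling per cycle) for both amplicons; the Ct values are hypothetical and the function name is ours, not taken from the paper's Supplementary Methods.

```python
def mtdna_to_gdna_ratio(ct_mito, ct_nuclear, efficiency=2.0):
    """Relative mtDNA abundance per unit of nuclear (genomic) DNA.

    Assumes both amplicons amplify with the same per-cycle efficiency
    (2.0 means perfect doubling); a lower Ct means more template.
    """
    return efficiency ** (ct_nuclear - ct_mito)


# Hypothetical threshold cycles for a wild-type and a knockout heart sample.
wt_ratio = mtdna_to_gdna_ratio(ct_mito=15.0, ct_nuclear=25.0)
ko_ratio = mtdna_to_gdna_ratio(ct_mito=16.0, ct_nuclear=25.0)

# Knockout mtDNA content expressed relative to wild type.
ko_relative_to_wt = ko_ratio / wt_ratio
```

With these invented numbers, a one-cycle delay of the mitochondrial amplicon at constant nuclear Ct corresponds to a halving of the mtDNA-to-gDNA ratio.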


Figure 4 | Endog regulates mitochondrial function and cardiac lipid
metabolism. a, Transmission electron micrographs (TEM) and oil red O
stained micrographs (high resolution in Supplementary Fig. 7) of left
ventricular sections from WT and Endog−/− mice. Vertical arrows indicate
lipid droplets, and the horizontal arrow indicates a mitochondrion.
b, Quantification of the number of mitochondria-associated droplets in WT
and Endog−/− mice. c, Quantification of cardiac triglyceride (TAG),
phospholipid and cholesterol content in the hearts of WT and Endog−/− (KO)
mice (n = 5). d, Ratio of mtDNA to genomic DNA (gDNA) in the hearts of WT
and Endog−/− (KO) mice. e, Quantification of mitochondrial protein content
in WT and Endog−/− mice (n = 5). f, State 3 and state 4 oxygen consumption in
the presence of substrates for complex I or complex II of the electron transport
chain in cardiac mitochondria isolated from WT (n = 6) or Endog−/− (KO)
(n = 5) mice. *CS, corrected for citrate synthase activity. g, Relative
fluorescence-based measurement of ROS production by mitochondria isolated
from WT (n = 6) or Endog−/− (n = 5) mice. h–k, Representative flow
cytometric analysis of mitochondrial mass in HEK293 and H9c2 cells
overexpressing human or rat ENDOG, respectively (n = 4). h, Stable expression
of ENDOG in HEK293 cells (HEK293-ENDOG). i, Flow cytometric analysis of
HEK293 and HEK293-ENDOG cells stained with MitoTracker. j, Adenovirus
(ad)-mediated expression of GFP and rat ENDOG in cardiomyocytes (top),
and flow cytometric analysis of ad.Endog-infected cells (Q2) and uninfected
control cells (Q1). Colours denote the event density: from highest to lowest
density, red, orange, green then blue. k, Number of events plotted against
mitochondrial mass for ad.Endog-infected (Q2) and control (Q1) H9c2 cells.
l, Quantitative PCR of mtDNA–protein complexes after ChIP of mitochondrial
chromatin using anti-ENDOG antibody or IgG. All data are presented as
mean ± s.e.m. *, P < 0.05; **, P < 0.01; ***, P < 0.001.

Mitochondria are essential for oxidative metabolism, and mitochondrial
dysfunction and/or depletion in the heart causes maladaptive
cardiac hypertrophy and cardiac dysfunction associated with increased
amounts of ROS and lipotoxicity. Here we identified Endog loss
of function as a primary determinant of maladaptive cardiac hypertrophy
that is associated with mitochondrial dysfunction and depletion
and with marked cardiomyocyte steatosis. The mechanism underlying
cardiac hypertrophy that results from impaired mitochondrial function
is not limited to a single pathway, but we demonstrated a conserved
increase in ROS, which are established hypertrophic stimuli,
in Endog loss-of-function models. Our studies resolve some of the
uncertainty about the non-apoptotic function of Endog and reveal
its importance in mitochondrial biology, which has intriguing parallels
with the dual roles of apoptosis-inducing factor. We propose that
ENDOG, which we show binds to mtDNA, modulates mtDNA synthesis,
maintenance and/or transcription, which is consistent with
previous hypotheses. Therapeutic targeting of the PGC1A-ERRA
axis has been proposed to improve mitochondrial function in cardiac
failure, and our studies suggest that regulation of Endog is an important
component of this process. We conclude that Endog is a novel
determinant of maladaptive cardiac hypertrophy with previously
unappreciated mitochondrial functions.


METHODS SUMMARY 


Linkage mapping was carried out using microsatellite genotypes in the BN × SHR
F2 population. Ex vivo heart weight analysis was performed in the congenic rat
strains, which were characterized using in vivo blood pressure telemetry.
Comparative haplotype analysis was performed using single nucleotide
polymorphism data (Rat Genome Database; http://rgd.mcw.edu/) for all strains used
in the QTL mapping studies. Microarray-based expression analysis was conducted
as described previously. Cell size and hypertrophy biomarker expression were
measured in cardiomyocytes after lentivirus-mediated Endog knockdown. Heart
weight, hypertrophic biomarker expression and cardiomyocyte size were measured
in Endog−/− mice at baseline and after angiotensin-II-induced hypertrophy.
Triglyceride abundance, mitochondrial mass and respiratory activity were
measured in Endog−/− mice as described in the Supplementary Information. Weighted
gene co-expression network analysis (WGCNA) was applied to the largest
publicly available human heart transcriptome data set. Regulation of Endog by PGC1α
was investigated in neonatal cardiomyocytes infected with adenovirus expressing
Pgc1α, in MCK-Pgc1α skeletal muscle and in Pgc1αΔc/Δc heart samples. The
association of ERR-α with the ENDOG promoter and the ENDOG–mtDNA interaction
were determined using ChIP. Histological analysis and electron microscopy of
Endog−/− hearts were carried out to study mitochondrial structure and abundance,
as well as lipid deposition. Genomic DNA and mtDNA copy number were assessed
by quantitative PCR. The mitochondrial abundance in cells was studied by flow
cytometry. Full methods are provided in the Supplementary Methods.

Control of flowering and storage organ formation in

Seasonal fluctuations in day length regulate important aspects of
plant development such as the flowering transition or, in potato 
(Solanum tuberosum), the formation of tubers. Day length is 
sensed by the leaves, which produce a mobile signal transported 
to the shoot apex or underground stems to induce a flowering 
transition or, respectively, a tuberization transition. Work in 
Arabidopsis, tomato and rice (Oryza sativa) identified the mobile 
FLOWERING LOCUS T (FT) protein as a main component of the 
long-range ‘florigen’, or flowering hormone, signal. Here we
show that expression of the Hd3a gene, the FT orthologue in rice, 
induces strict short-day potato types to tuberize in long days.
Tuber induction is graft transmissible and the Hd3a-GFP protein 
is detected in the stolons of grafted plants, transport of the fusion 
protein thus correlating with tuber formation. We provide evid- 
ence showing that the potato floral and tuberization transitions are 
controlled by two different FT-like paralogues (StSP3D and 
StSP6A) that respond to independent environmental cues, and 
show that an autorelay mechanism involving CONSTANS modu- 
lates expression of the tuberization-control StSP6A gene. 

Potato, the third largest global food crop after wheat and rice, is 
cultivated for its underground storage stems or tubers, which are rich 
in starch and other nutrients. Short days and cool temperatures pro- 
mote tuber formation, ensuring that differentiation of these vegetative 
propagation organs precedes winter. Whereas cultivated potatoes 
derive from Chilean landraces more adapted to long-day conditions, 
Andean types such as the S. tuberosum group Andigena (Solanum 
tuberosum andigena) tuberize only in short days. These plants require 
night periods longer than a critical length, as a pulse of light during the 
night (a ‘night break’) represses tuberization, as seen in strict short-day 
flowering plants.

Day length is sensed by expanded leaves, which synthesize a mobile
signal or ‘tuberigen’ that is transported to the underground stems to
induce tuber formation. This long-distance signal shares several features
with the mobile florigen, and several lines of evidence suggest
that related photoperiodic pathways may control synthesis of
both signals. In Arabidopsis thaliana, activation of the FT gene by
CONSTANS (CO) mediates floral transition in long days.
Transport of the FT protein from the leaf to the shoot apical meristem
has been demonstrated in diverse ways, and it is broadly accepted
that this phloem-mobile protein functions as the florigen. Closely
related genes also mediate control of flowering in rice, although in this
short-day plant the CO orthologue Heading date 1 (Hd1) activates
expression of the FT-like Hd3a gene in short days but represses its
expression under long-day conditions.

A CO-dependent pathway is also thought to mediate short-day
tuberization, as expression of Arabidopsis CO in Andigena plants
delays tuber formation in short days. High light irradiance, however,
induces Andigena potatoes to flower in long days, although these
conditions are restrictive for tuberization. This long-day flowering
response seemed to argue against a role for FT in tuberization,
implicating an additional CO target as the mobile tuberigen. Here we show
that ectopic expression of the rice Hd3a gene induces Andigena plants
to flower and tuberize under non-inductive long days, demonstrating
the potential of FT to act as the mobile tuberigen. We show, in addition,
that flowering and short-day tuberization responses are regulated
by two members of the potato FT-like gene family that respond to
different environmental cues.

To assess whether FT has a role in tuberization, we transformed
Andigena plants with the rolC::Hd3a–GFP construct (GFP, green fluorescent
protein), which in rice promotes floral transition in long days.
Lines expressing this construct were induced to flower (Fig. 1b, c) and
were able to tuberize in non-inductive long days (Fig. 1a). When grafted
to wild-type plants, these lines induced the wild-type controls to tuberize
in long days, independently of whether they were used as donors (Hd3a
onto wild type) or as stocks (wild type onto Hd3a), whereas none of the
control grafts (wild type onto wild type; Fig. 1d) tuberized in
non-inductive long days. We detected the Hd3a–GFP protein but not its
transcript in the stolons of grafted wild-type stocks (Supplementary Fig.
1a, c), demonstrating that the protein but not the RNA can move across
the graft junction and function as a powerful tuberization inducer.

Studies in tomato identified six members of the SELF-PRUNING
(SP) gene family, and sequencing of two diploid potato (Tuberosum
RH89-039-16 and Phureja DM1-3 516 R44) genomes recently led to
the identification of three additional FT and TFL family members.
Phylogeny of these genes grouped the StSP6A, StSP5G, StSP5G-like
and StSP3D homologues into the same clade as the Arabidopsis FT,
tomato SINGLE-FLOWER TRUSS (SFT) and rice Hd3a genes
(Supplementary Fig. 2). Quantitative PCR with reverse transcription
(RT-PCR) revealed that these transcripts are expressed in leaves or
stolons, whereas transcripts for SP/TFL1 and MFT members are more
ubiquitously distributed (Fig. 1e). Interestingly, StSP6A gene expression
strongly correlates with tuberization, high levels of expression
being observed in leaves and stolons of short-day-induced plants
and in antisense phytochrome B lines, which tuberize constitutively
(Fig. 1f and Supplementary Information, section 1). This expression
profile suggests that this gene is involved in tuberization control.
Consistent with this, StSP6A overexpression (StSP6Aox) lines are able to
tuberize under non-inductive long days (Fig. 2a, b) and are induced to
flower, although their flowering phenotype is less severe than in Hd3a lines
(Supplementary Fig. 3a, b and Supplementary Information, section 2).
StSP6A silencing, in turn, strongly delays tuber formation in short
days (Fig. 2c, d), pointing to an essential role for this FT-like protein
in tuberization promotion. StSP6A expression analyses in commercial
cultivars with early (Jaerla), late (Baraka) and intermediate (Kennebec)
tuberization periods, in addition, show that accumulation of this transcript
in leaves correlates with the tuberization time of these cultivars
(Supplementary Fig. 6), indicating that this FT paralogue is involved in
tuberization control even in non-photoperiodic cultivars.

Figure 1 | Phenotype of Andigena rolC::Hd3a–GFP lines and expression
profiles of the potato FT- and TFL1-like genes. a, Hd3a lines (centre and
right) tuberize under long-day conditions. Left, wild-type (WT) control.
b, c, Early-flowering Hd3a plants (c) relative to wild type (b). d, Tuber
induction in long days of wild-type plants grafted with Hd3a donors (right) or
stocks (centre). Wild-type control grafts do not tuberize (left). Red lines
indicate the graft junction. e, Relative levels of expression of the potato FT-
and TFL1-like genes. 11707, PGSC0003DMG400011707; 16180,
PGSC0003DMG400016180; 07111, PGSC0003DMG400007111; 14322,
PGSC0003DMG400014322; Veg., vegetative apex; Inf., inflorescence apex; NB,
night break; SD, short day. f, Semiquantitative RT-PCR analysis of StSP6A
expression in leaves and stolons of plants with different tuberization states
(indicated on top). The pictures show non-induced stolons or tubers. LD,
long day.


Day-neutral tomato flowering is regulated by the FT-homologue
SFT/SP3D gene. This gene has been reported to be regulated
independently of CO and day length, its potato orthologue being likely
to have a role in light-irradiance-dependent flowering. StSP3D-silenced
lines were indeed found to be late flowering (Fig. 2e–h),
although under short-day conditions they tuberized at the same time
as untransformed controls (Supplementary Fig. 7 and Supplementary
Information, section 2), indicating that this gene is important in
flowering but not in the tuberization transition. Remarkably, in modern
tomato cultivars SP6A was reported to have an extra nucleotide (T at
position 421) that leads to a premature stop codon, implying that
this gene is non-functional. StSP3D, in turn, retains a weak short-day
activation response in Andigena, although transcript levels are lower
than those of StSP6A (Fig. 1e). Together, these observations suggest that
members of the FT-like family in Solanaceae diversified such that
SFT/SP3D ceased to be regulated by CO and became responsive to other
environmental cues, thereby becoming important in day-neutral flowering
control. SP6A function was lost in modern tomato cultivars but in
potato plays a major part in tuberization, as day-length-dependent
activation of this gene mediates strict short-day tuberization of
Andigena species. Thus, the roles of these two FT paralogues in
flowering and tuberization control have evolved, at least in part,
through changes in their expression profiles, both genes encoding
functionally similar proteins, as indicated by the finding that
StSP6A expression in Arabidopsis rescues the late-flowering phenotype
of co-1 and ft-1 mutants (Supplementary Fig. 8).

Figure 2 | Function of the StSP6A and StSP3D genes in short-day-dependent
tuberization and day-neutral flowering control. a, b, StSP6A
expression in leaves of StSP6Aox lines (a) and tuber induction in these lines
(b) in long days. c, d, StSP6A silencing in leaves of StSP6A RNAi (SP6Ai) lines
(c) correlates with late tuberization in short days (d). e, Relative StSP3D
expression in StSP3D RNAi (SP3Di) lines. Error bars, s.d. (n > 3). f, Late
flowering of StSP3D-silenced lines. g, h, Scanning electron microscopy images
of the StSP3D RNAi apexes: vegetative apex in L26 (g); apex at inflorescence
transition in L32 (h). Scale bars, 200 μm.


Interestingly, two other FT-clade members, StSP5G and StSP5G-like 
(Supplementary Table 2), are expressed under non-inductive long-day 
conditions (Fig. 1e). This expression profile suggests an antagonistic
function for StSP6A and these two FT paralogues (Supplementary
Information, section 3), as recently reported for the sugar beet (Beta
vulgaris) BvFT1 and BvFT2 genes. Further studies will be required to
assess whether these FT-clade members have an inhibitory effect on 
tuberization. 

StSP6A is expressed not only in leaves but also in stolons of tuberiz- 
ing plants (Fig. 1f). Expression in these organs is delayed with respect 
to the leaves, suggesting an autoregulatory loop for the transported 
protein. In support of this relay mechanism, increased levels of 
expression of the endogenous StSP6A transcript are observed in 
Hd3a lines (Supplementary Fig. 10) and wild-type stocks grafted to 
these plants (Supplementary Fig. 10a). Thus, in contrast to the 
Arabidopsis FT or rice Hd3a gene, StSP6A is regulated by a relay 
mechanism that sustains synthesis of the inducing signal in stolons. 
In this regard, a local balance between the SP floral repressor and the 
tomato SFT florigen signal has been shown to contribute to the differential
flowering response of primary and secondary shoot meristems.
Hence, it is possible that having more SP in stolons confers a reduced 
sensitivity to FT, which may explain why floral transition is activated 
without tuber formation, whereas Hd3a overexpression activates both 
developmental processes. 

To rule out that tuberization of grafted stock plants is mostly
mediated by this relay mechanism, we grafted Hd3a scions onto
StSP6A RNA interference (RNAi) stocks to test whether inhibition 
of StSP6A expression blocks Hd3a signalling. As shown in Fig. 3a, 
activation of the endogenous StSP6A gene is strongly reduced in 
StSP6A RNAi compared with the wild-type stocks. Tuberization onset 
is delayed and tuber yield reduced in RNAi lines relative to the Hd3a/ 
wild-type grafted controls (Fig. 3b), owing to impaired signal amp- 
lification, but this does not preclude tuberization of the RNAi stocks, 
highlighting induction by the Hd3a protein. 

Finally, we tested whether this regulatory relay requires CO function,
by grafting StSP6Aox plants onto StCOox stocks and analysing
endogenous StSP6A gene expression (Fig. 3c–e). Like the rice Hd1
protein, StCO represses StSP6A gene expression in long days, and
transfer to short-day conditions relieves this repression (Supplementary
Information, section 4). In line with this function, activation of the
StSP6A gene is largely repressed in stolons of the StSP6Aox/StCOox
grafted plants (Fig. 3c), relative to StSP6Aox/wild-type grafts used as
controls, which implies that StCO is involved in the autoregulatory
loop that drives StSP6A expression in stolons. Moreover, StSP6A
accumulates only to basal levels in the stolons of StCOox/wild-type and
wild-type/wild-type grafts used as negative controls (StSP6A is not
expressed in long days in StCOox or wild-type scions), which implies
that expression in these organs requires the mobile protein produced in
the leaves.

Figure 3 | Autoregulatory loop for StSP6A expression. a, Levels of StSP6A
transcript in transgenic (left) and grafted (right) plants. Hd3a scions were
grafted onto wild-type or StSP6A RNAi (SP6Ai) lines. b, Tuber induction in
grafted plants, scored when 100% of the Hd3a/wild-type grafts were tuberizing.
c, Relative expression of the endogenous StSP6A transcript in COox and
SP6Aox plants (left), and in grafts with wild-type and COox stocks (right).
d, Tuber induction of the grafted plants grown in non-inductive long days.
Error bars, s.d. of two experimental replicates (n > 8). See Supplementary Fig.
12 for each replicate result. e, StSP6A autoregulation.

Our data provide the molecular basis for a long-standing physiological
observation, namely that flowering tobacco plants grafted onto
non-induced potato stocks induce tuberization of the stock plants,
irrespective of the photoperiodic requirement of the donor plant.
An additional issue is how flowering and tuberization transitions are
differentially triggered in response to the mobile FT signal. In
Arabidopsis, FT interacts with FD to activate expression of floral
meristem identity genes. We observed that Hd3a stolons initiate
floral buds from the apical meristem (Fig. 4a–c) at the same time as
they differentiate tubers from the subapical region. In transgenic
lines expressing the StSP6A protein under control of an ethanol-inducible
promoter, tuber-specific transcripts were observed within 4 h
of induction in the stolons (Fig. 4d, e and Supplementary Information,
section 5). This very rapid response precludes transport of any mobile
signal from the leaves and thus supports the notion that the StSP6A
protein is the mobile tuberigen transported to below-ground organs.
Thus, it is possible that StSP6A interacts in stolon subapical cells with
an as yet unknown transcriptional regulator, or that FD–StSP6A
differentially activates a set of target genes specific to these cells, such as
StGA2ox1 (ref. 29; Fig. 4d). A challenge for the future is to identify such
stolon-specific StSP6A target genes and/or StSP6A-interacting partners,
to establish how formation of these storage organs is regulated.

Figure 4 | Floral phenotype of Hd3a stolons and proposed model for flower
and tuber induction. a–c, Flower induction in Hd3a stolons. Scanning
electron microscope images of wild-type (a) and Hd3a (b, c) stolon apical
meristems. Scale bars, 100 μm. d, Local StSP6A induction activates StGA2ox1
gene expression in the stolon. e, Activation of other genes reported to be
induced early during tuber development. Colour bars represent fold change
(FC) in gene expression in Alc-StSP6A stolons relative to Alc-uidA controls
(black bars). Alc, ethanol-inducible promoter; error bars, s.d. of two
experimental replicates (n > 10). f, Model for regulation of flowering and
tuberization transitions. See Supplementary Information, section 6, for a
detailed explanation.

Identification of FT as a switch for tuberization supports the notion 
that FT function extends beyond flowering induction (Fig. 4f). Other 
studies have implicated FT in seasonal control of growth cessation in 
poplar trees and meristem growth termination in tomato. Thus, FT
is emerging as a key mobile signal controlling not only flowering but 
also a number of other meristem-associated transitions. 


METHODS SUMMARY 

Plant materials and growth conditions. Andigena 7540 wild-type plants and
antisense phytochrome B and AtCOox (ref. 17) lines were grown in a greenhouse
under long-day conditions. Growth chambers were used for controlled short-day
(8 h light, 16 h dark) and short-day/night-break (short day plus a 30-min pulse of
light in the middle of the night) treatments. Transgenic Andigena plants were
generated by Agrobacterium-mediated transformation of leaf explants. Plants
were grafted as previously described and cultivated under long-day conditions
to analyse their tuberization response.

Plasmid constructs. StSP6A was cloned into the pBinAR binary vector to generate 
StSP6A-overexpressing lines. StSP6A, StSP3D and StCO RNAi constructs were
generated by inserting the 3′ non-conserved regions of these genes in opposite
orientations into the pBIN19RNAi vector. The pGWB2 vector was used for StCO 
overexpression. StSP6A ethanol-inducible lines were generated by transformation 
with the pBinSRNA-GW plasmid containing the StSP6A coding region. Primer 
sets used for these constructs are listed in Supplementary Table 3. 

Real-time RT-PCR analyses. First-strand complementary DNA was synthesized
from 2 μg total RNA, and 1 μl of the reaction was used for real-time gene expression
analysis with the SYBR Green PCR master mix (Applied Biosystems). The actin 8
gene was used for normalization. Identical procedures were used for semiquantita- 
tive RT-PCR, except that amplification was conducted in a Peltier thermal cycler 
(PTC-200, MJ Research). For sets of specific primers and product lengths, see 
Supplementary Table 3. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS


Plant materials and growth conditions. Andigena 7540 wild-type lines and the
antisense phytochrome B (phyB) and AtCO overexpresser (AtCOox) lines were
grown in the greenhouse under long-day conditions. Growth chambers were used
for controlled short-day (8 h light, 16 h dark), short-day/night-break (short day
plus a 30-min pulse of light given 8 h after the beginning of the dark period) and
long-day (16 h light, 8 h dark) treatments.
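
To make the three light regimes above easier to compare, the following sketch computes the longest uninterrupted dark period in each one, using only the durations stated in the text. The function and variable names are ours, and no critical night length is assumed, because the text does not give one.

```python
def longest_dark_period(dark_hours, light_pulses=()):
    """Longest uninterrupted stretch of darkness, in hours.

    light_pulses is a sequence of (start, duration) pairs, where start is
    measured in hours after the beginning of the dark period.
    """
    cursor = 0.0   # start of the current dark stretch
    longest = 0.0
    for start, duration in sorted(light_pulses):
        longest = max(longest, start - cursor)
        cursor = start + duration
    return max(longest, dark_hours - cursor)


regimes = {
    # 8 h light, 16 h dark
    "short day": longest_dark_period(16.0),
    # short day plus a 30-min light pulse 8 h into the dark period
    "short day + night break": longest_dark_period(16.0, [(8.0, 0.5)]),
    # 16 h light, 8 h dark
    "long day": longest_dark_period(8.0),
}
```

Under these schedules the night break cuts the longest continuous dark stretch from 16 h to 8 h, the same as in the long-day regime, which fits the observation that a night break represses tuberization in plants that require long nights.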

Transgenic Andigena plants carrying the rolC::Hd3a–GFP, StSP6A
overexpression, StSP6A RNAi, StSP3D RNAi, StCOox and StCO RNAi constructs were
generated by Agrobacterium-mediated transformation of leaf explants as
described previously. Different graft combinations were obtained as previously
described and their tuberization phenotype was scored under long-day
conditions. Tuber formation in StSP6A RNAi and StSP3D RNAi lines was analysed by
growing the plants under long-day greenhouse conditions until a ten-leaf stage,
before transferring them to short-day conditions. For flowering studies, plants
were grown under high-irradiance (>200 μE s−1 m−2) long-day conditions. At
least nine replicates of each line were used for these studies.

Arabidopsis constans-1 (co-1) and flowering locus t-1 (ft-1) mutants in the
Columbia (Col-0) and Landsberg erecta (Ler) backgrounds were transformed
using the floral dip method. Flowering time was measured as the number of rosette 
leaves at floral initiation under long-day conditions, in at least ten individuals. 
Plasmid constructs. The rolC::Hd3a–GFP construct has been described
elsewhere. Transcripts corresponding to the StSP6A, StSP3D and StCO genes were
amplified using primers designed on the tomato sequences (AY186737, AY186735
and AY490253, respectively). The StSP6A overexpression construct was generated 
by amplifying the protein-coding region and then inserting the PCR product into 
the SmaI site of the pBinAR binary vector, between the 35S promoter and the ocs
terminator. To silence the StSP6A, StSP3D and StCO transcripts, Andigena plants 
were transformed with RNAi constructs designed on the non-conserved region of 
the genes. The amplification products were cloned into the pENTR/D-TOPO 
plasmid (Invitrogen) and inserted in opposite orientations by recombination with 
the LR Clonase II enzyme (Gateway Technology, Invitrogen) into the pBIN19RNAi 
destination vector. The pBIN19RNAi interference vector was generated by partial 
XbaI/HindIII digestion of the pH7GWIWG2(II) plasmid (http://gateway.psb.
ugent.be/) and insertion of the Gateway cassette fragment into the same restric- 
tion sites of the pBIN19 binary vector. The StCO overexpression construct was 
generated by amplifying and cloning the protein-coding region into the pENTR/D- 
TOPO plasmid (Invitrogen), and further insertion by recombination into the 
pGWB2 destination vector. To generate the StSP6A ethanol-inducible lines, the
pBinSRNA-GW destination vector was created by insertion of the blunt-ended 
pAlcA-R1-R2-t35S cassette from the AlcAP-GW pGreen vector (gift of Patrick 
Laufs, INRA), generated by XbaI digestion, into the blunted HindIII sites of the
binary vector pBinSRNACatN. The StSP6A coding region was then introduced in 
this destination vector by LR Clonase recombination. A list of the primer sets used 
to generate these constructs is shown in Supplementary Table 3. 
Real-time and semiquantitative RT-PCR analyses. Expression of potato FT- 
and TFL1-like family members was quantified by real-time PCR. First-strand
complementary DNA was synthesized from 2 µg total RNA and 1 µl of the reaction was used
for real-time gene expression analysis with the SYBR Green PCR master mix 
(Applied Biosystems). Quantitative PCR was performed using the Power SYBR 
Green PCR Master Mix (Applied Biosystems) on an ABI7500 Real-Time PCR 
System (Applied Biosystems), following the manufacturer’s instructions. Primer 
pairs were specifically designed for each gene using PRIMER EXPRESS 3.0 software 
(Applied Biosystems) and probed for high-efficiency amplification under standard 
quantitative PCR conditions. All reactions were carried out in at least two independent
biological replicates. The Pfaffl** method was used for relative quantification
of gene expression. Direct $2^{-\Delta C_T}$, where $\Delta C_T = C_T(\text{target gene}) - C_T(\text{actin})$
(ref. 33), was used to generate the tissue-specific heat map in Fig. 1e. The comparative
critical threshold ($2^{-\Delta\Delta C_T}$) method*! was used to analyse relative Hd3a and
StSP6A expression levels (Supplementary Figs 1 and 10).
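
As an illustration of the two relative-quantification formulas named above, the following minimal Python sketch computes a direct $2^{-\Delta C_T}$ value and a comparative $2^{-\Delta\Delta C_T}$ fold change. It assumes ideal amplification (a doubling per cycle); the function names and the threshold-cycle values in the usage lines are hypothetical and are not taken from the study.

```python
def direct_relative_expression(ct_target: float, ct_reference: float) -> float:
    """Direct 2^-dCT: target expression relative to a reference gene such as actin."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def comparative_fold_change(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Comparative 2^-ddCT: fold change of a sample relative to a control condition."""
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    return 2.0 ** (-(delta_ct_sample - delta_ct_control))


# Hypothetical threshold cycles, for illustration only.
tissue_value = direct_relative_expression(ct_target=25.0, ct_reference=20.0)
fold_change = comparative_fold_change(24.0, 20.0, 26.0, 20.0)
```

A target that crosses the threshold two cycles earlier than in the control, after normalization to the reference gene, corresponds to a fourfold increase under the doubling-per-cycle assumption.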

Identical procedures were used for semiquantitative RT-PCR, except that amp- 
lification was conducted in a Peltier thermal cycler (PTC-200, MJ Research) and 
one-tenth of the complementary DNA reaction was used for actin amplification. 
For sets of specific primers and product lengths, see Supplementary Table 3. 
Microarray sample hybridization and analysis. Stolons of wild-type plants were 
collected in long days and two, six and eight days after transferring the plants to 
short days. Time-course profiling analyses of short-day-induced stolons were
performed as described previously”. 

Arrays of StSP6A overexpression versus wild type (sampled in long days) and 
StSP6A RNAi stolons versus wild type (sampled in short days, day 6) were per- 
formed with samples from three independent lines. Samples were hybridized 
against POCI (Potato Oligo Chip Initiative) microarrays, and background correction
and normalization were performed using LIMMA**”. The data obtained were
statistically checked (false discovery rate, <0.05) and genes with a log ratio change
of +2 or −2 for StSP6A RNAi and +1.8 or −1.8 for StSP6Aox plants were
selected. The hierarchical cluster of genes present in both experiments was calcu- 
lated, compared with the short-day profile and represented using the TIGR MEV 
free software (http://www.tm4.org/mev/). Genes found to be differentially 
expressed are listed in Supplementary Table 1. 
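
The selection step described above can be sketched as a simple filter. This is only an illustration, not the LIMMA workflow itself: it assumes that each gene already has a log ratio and a false discovery rate, and that "a log ratio change of +2 or −2" means an absolute log ratio at or beyond the cut-off. The gene names and values are hypothetical.

```python
def select_genes(results, min_abs_log_ratio, max_fdr=0.05):
    """Keep genes passing the FDR cut-off whose |log ratio| reaches the threshold.

    `results` is an iterable of (gene, log_ratio, fdr) tuples.
    """
    selected = []
    for gene, log_ratio, fdr in results:
        if fdr < max_fdr and abs(log_ratio) >= min_abs_log_ratio:
            selected.append(gene)
    return selected


# Hypothetical records, for illustration only.
example = [
    ("geneA", 2.4, 0.01),
    ("geneB", -2.1, 0.03),
    ("geneC", 1.9, 0.01),
    ("geneD", 3.0, 0.20),  # large change, but fails the FDR cut-off
]
rnai_hits = select_genes(example, min_abs_log_ratio=2.0)  # RNAi threshold
ox_hits = select_genes(example, min_abs_log_ratio=1.8)    # overexpression threshold
```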

Immunoblot analysis. Total protein extracts from stolon tissues were obtained in 
lysis buffer (50mM Tris-HCl, pH 7.5, 150mM NaCl, 1% Na deoxycholate, 0.5% 
Triton X-100, 1mM PMSF and protease inhibitors). For analysis of protein graft 
transmissibility, total extracts were incubated overnight with 10 µl of an anti-GFP
affinity matrix (MBL). After extensive washing, unbound and bound proteins were 
separated by 10% SDS-polyacrylamide gel electrophoresis, blotted onto nitrocellulose 
and probed with an anti-GFP antibody (Roche). Immunoreactive proteins were 
detected with the SuperSignal West Pico Chemiluminescence kit (Pierce). 
Microscopy. GFP fluorescence was observed on longitudinal stolon sections 
(150 µm) obtained with a vibratome (PELCO 101). Fluorescence was excited with
a 488-nm argon laser and emission images were collected in the 500-600-nm 
range using a Leica TCS SP5 spectral confocal microscope. 

For scanning electron microscopy, potato stolons and apical meristems were
frozen in an Oxford CT 1500 cryosystem (Oxford Instruments), sublimated under
vacuum and observed using a JEOL JSM 5410 electron microscope operating at
10 kV.
Phylogenetic analysis. Full-length protein sequences, except sequences of the 
potato FT- and TFL1-like family members, were obtained from GenBank. 
Sequences of potato FT- and TFL1-like family members were obtained from the 
Potato Genome Sequencing Consortium Data Release”® (http://potatogenomics. 
plantbiology.msu.edu/) by TBLASTX using as a query previously described tomato 
sequences’” (SlSP, no. U84140; SlSP6A, no. AY186737; SlSP3D, no. AY186735;
SlSP9D, no. AY186738; SlSP2I, no. AY186734; and SlSP5G, no. AY186736). The
best genome matches (Supplementary Table 2) were downloaded and open reading
frames were predicted using FGENESH (http://linux1.softberry.com/berry.phtml)
and ORF Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/). A counterpart to
SlSP2I was not found in the potato genome. Phylogenetic analyses were conducted
using MEGA4**. Sequences were aligned with the COBALT program (http:// 
www.ncbi.nlm.nih.gov/tools/cobalt/) and their evolutionary relationship inferred 
using the neighbour-joining method”. All ambiguous positions were removed for 
each sequence pair. The evolutionary distances were computed using the Poisson 
correction method and the bootstrap consensus tree was inferred from 1,000 repli- 
cates. The accession numbers for the corresponding genes are indicated in the tree 
(Supplementary Fig. 2). 
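
The Poisson correction mentioned above converts the proportion p of differing amino-acid sites between two aligned sequences into a distance d = −ln(1 − p). The sketch below is a minimal illustration and not the MEGA4 implementation; it assumes that gaps and ambiguous residues are skipped pair by pair, and the function name, the set of skipped symbols and the toy sequences are hypothetical.

```python
import math


def poisson_distance(seq_a: str, seq_b: str, skip: str = "-X?") -> float:
    """Poisson-corrected distance between two aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in skip or b in skip:
            continue  # pairwise removal of gaps and ambiguous positions
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = differing / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy alignment: 7 comparable positions, 1 difference.
d = poisson_distance("MKTAYL-A", "MRTAYLGA")
```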

Peripheral SMN restoration is essential for long-term rescue of a severe spinal muscular atrophy mouse model


Yimin Hua¹, Kentaro Sahashi¹, Frank Rigo², Gene Hung², Guy Horev¹, C. Frank Bennett² & Adrian R. Krainer¹


Spinal muscular atrophy (SMA) is a motor neuron disease and the 
leading genetic cause of infant mortality; it results from loss-of- 
function mutations in the survival motor neuron 1 (SMN1) gene’. 
Humans have a paralogue, SMN2, whose exon 7 is predominantly 
skipped’, but the limited amount of functional, full-length SMN 
protein expressed from SMN2 cannot fully compensate for a lack of 
SMN1. SMN is important for the biogenesis of spliceosomal small 
nuclear ribonucleoprotein particles’, but downstream splicing 
targets involved in pathogenesis remain elusive. There is no effec- 
tive SMA treatment, but SMN restoration in spinal cord motor 
neurons is thought to be necessary and sufficient*. Non-central 
nervous system (CNS) pathologies, including cardiovascular 
defects, were recently reported in severe SMA mouse models and 
patients’ *, reflecting autonomic dysfunction or direct effects in 
cardiac tissues. Here we compared systemic versus CNS restoration 
of SMN in a severe mouse model”. We used an antisense oligo- 
nucleotide (ASO), ASO-10-27, that effectively corrects SMN2 
splicing and restores SMN expression in motor neurons after 
intracerebroventricular injection'’'*. Systemic administration of 
ASO-10-27 to neonates robustly rescued severe SMA mice, much 
more effectively than intracerebroventricular administration; 
subcutaneous injections extended the median lifespan by 25-fold.
Furthermore, neonatal SMA mice had decreased hepatic Igfals 
expression, leading to a pronounced reduction in circulating 
insulin-like growth factor 1 (IGF1), and ASO-10-27 treatment 
restored IGF1 to normal levels. These results suggest that the liver 
is important in SMA pathogenesis, underscoring the importance 
of SMN in peripheral tissues, and demonstrate the efficacy of a 
promising drug candidate. 

To compare the effectiveness of ASO-10-27 delivered centrally 
versus systemically, we administered an intracerebroventricular 
(ICV) injection of 20 µg ASO-10-27 on postnatal day 1 (P1) to increase
SMN expression in CNS tissues, or we administered a subcutaneous
(SC) injection of the ASO on two separate days at 50 µg per g of body
weight (µg g⁻¹), between P0 and P3 (two doses). These doses were
based on our previous studies with this ASO'’’. We also evaluated
combined ICV and SC injections, as well as repeated SC injections
(Supplementary Table 1). Control heterozygous mice (Smn+/− SMN2+/0)
that received ICV and/or SC ASO-10-27 injections had
normal survival and behaviour. Severe SMA mice (Smn−/− SMN2+/0)
that received ICV and/or SC saline injections survived for
1-2 weeks, with a median survival time of ~10 days, similar to 
untreated mice (Fig. 1a, Supplementary Figs 1a and 2a, and
Supplementary Movie 1). Delivery of the ASO only into the CNS 
efficiently corrected SMN2 exon 7 splicing in the spinal cord and led 
to a striking increase in SMN protein levels, but only modestly extended the
median survival to 16 days, with a single pup surviving for 1 month
(Fig. 1a–c and Supplementary Fig. 2b–d). In marked contrast, systemic
treatment with two SC injections resulted in a median survival of 108
days (Fig. 1d). Combining ICV and SC injections of the ASO further
increased the median survival to 173 days, and two additional SC
injections on P5 and P7, after the initial SC injections at P0–P3,
extended the median survival to 137 days (Fig. 1d).

Treated SMA mice varied in size from runts to comparable to their 
heterozygous littermates; their average weight was low, and their tails 
were much shorter than normal (Supplementary Figs 3 and 4). The 
surviving runts slowly gained weight, reaching ~18 g at ~3 months. 
Figure 1 | Systemic versus ICV ASO-10-27 injections in SMA mice.

a, Survival curves for mice after ICV administration of ASO-10-27 on P1.
Administration of 20 µg ASO-10-27 (ICV20, n = 14) or saline (ICV0, n = 18)
resulted in mean survival times of 17 and 10 days (d), respectively (P < 0.001).
ASO-10-27-treated heterozygotes (Het-ICV, n = 15) served as controls.

b, c, Spinal cord RNA and protein samples (n = 3) were analysed on P7 by using
radioactive RT-PCR (b) or immunoblotting with a monoclonal antibody
specific for human SMN (hSMN) (c). Δ7, exon-7-skipped mRNA; FL, full-length
mRNA; incl, exon 7 inclusion; % incl = 100 × FL/(FL + Δ7). d, Survival
curves after SC administration of saline (SC0, n = 26) or ASO-10-27 (SC50,
n = 12) twice between P0 and P3. SC50-SC50 (n = 14) mice received two
additional SC injections on P5 and P7. Het-SC-ICV (n = 13) and SC50-ICV20
(n = 18) were heterozygous and SMA mice, respectively, that received
combined P1 ICV and P0–P3 SC injections. SC-Late (n = 17) were SMA mice
that received only two SC injections, on P5 and P7. Each SC injection dose was
50 µg g⁻¹ body weight. P < 0.0001 for all groups versus SC0 except for SC-Late,
P < 0.05. e, Dose-dependent survival after two SC injections at P0–P3 with 40
(SC40, n = 26), 80 (SC80, n = 18) or 160 (SC160, n = 14) µg g⁻¹ of ASO-10-27.
Saline-treated SMA (SC0, n = 23) or heterozygous mice (Het, n = 18) served as
controls. P < 0.0001 for all groups versus SC0.

Most rescued SMA mice could run and climb normally; however, their
tails and ears developed necrosis and were gradually lost, resembling
the phenotype of type III SMA mice (Supplementary Fig. 3e, f).
Additional delivery of the ASO either by ICV injection on P1 or repeat 
SC injections on P5 and P7 delayed necrosis (Supplementary Fig. 3g). 

To further characterize the effects of the ASO administered systemically,
we carried out a dose-response study with 0 (SC0), 40 (SC40), 80
(SC80) and 160 (SC160) µg g⁻¹ ASO-10-27 as an SC injection, given
twice between P0 and P3. Systemic treatment with the ASO resulted in
a dose-dependent increase in survival (Fig. 1e), with the median survival
increasing from 10 days to 84, 170 and 248 days, respectively. At
the highest dose tested, the ASO given systemically resulted in long- 
term survival comparable to the best results achieved by adeno- 
associated virus expression of the SMN protein in a slightly less severe 
mouse model'*"!°. Remarkably, 2 of 14 mice in the SC160 group and 2 
of 18 in the SC80 group are still alive and active after >500 days. A 
similar survival benefit was achieved by intraperitoneal administration 
(Supplementary Fig. 5). There was no significant difference in weight 
gain among the three SC-dosing groups; however, mice in the SC160 
group had significantly longer tails (Supplementary Fig. 6a-e). We also 
observed dose-dependent rescue of ear and tail necrosis and dose- 
dependent delays in the development of cataracts and rectal prolapse 
(Supplementary Fig. 6f-h). Administration of the ASO in two doses on 
P5 and P7 resulted in a modest increase in survival, compared with 
earlier treatment (between P0 and P3), emphasizing the importance of
early postnatal therapeutic intervention (Fig. 1d). 
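
The dose-response figures above can be restated as fold increases in median survival over the saline baseline. The short Python sketch below uses only the medians quoted in the text (10 days for saline; 84, 170 and 248 days for the three doses); the variable names are illustrative.

```python
# Median survival (days) quoted in the text for each SC dosing group.
baseline_days = 10
median_survival_days = {"SC40": 84, "SC80": 170, "SC160": 248}

# Fold increase over the saline-treated median.
fold_increase = {
    group: days / baseline_days for group, days in median_survival_days.items()
}

for group, fold in fold_increase.items():
    print(f"{group}: {fold:.1f}-fold")
```

At the highest dose the ratio is 24.8, which matches the roughly 25-fold extension of median lifespan stated in the summary paragraph.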

To examine SMN2 splicing changes in various tissues after SC injec- 
tion of the ASO, we performed reverse transcription followed by PCR 
(RT-PCR) on RNA samples from P7 mice. We detected a dose- 
dependent increase in exon 7 inclusion in the spinal cord, brain, liver, 
heart, kidneys and skeletal muscle, with the strongest effect occurring 
in the liver and the weakest in the kidneys (Fig. 2a, b and Supplemen- 
tary Fig. 7a, b). By contrast, ICV administration of the ASO resulted in 
a much more robust change in exon 7 inclusion in the brain and spinal 
cord tissues but had very limited effects in peripheral tissues 
(Supplementary Fig. 8a). Immunoblotting of spinal cord, liver and 
heart tissue samples from mice treated by SC administration showed 
a corresponding increase in full-length SMN protein (Fig. 2c and Sup- 
plementary Figs 7c and 8b). Exon 7 inclusion in the liver significantly 
decreased after P30 (Fig. 2d, e and Supplementary Fig. 9), consistent 
with the measured half-life of the ASO being 22 days in the liver (data 
not shown). These data suggest that transiently increasing SMN 
expression in peripheral tissues during the first few weeks of life has 
a profound effect on long-term survival of severe SMA mice. 
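
To give a feel for what a 22-day half-life implies, the sketch below estimates the fraction of ASO remaining in the liver over time. It assumes simple first-order (exponential) decay, which the text does not state explicitly; only the 22-day figure comes from the text, and the function name is hypothetical.

```python
def fraction_remaining(days_elapsed: float, half_life_days: float = 22.0) -> float:
    """Fraction of the initial amount left after `days_elapsed`, assuming first-order decay."""
    if days_elapsed < 0 or half_life_days <= 0:
        raise ValueError("days_elapsed must be >= 0 and half_life_days must be > 0")
    return 0.5 ** (days_elapsed / half_life_days)


# Fraction left one, two and eight half-lives after dosing.
after_22_days = fraction_remaining(22.0)
after_44_days = fraction_remaining(44.0)
after_176_days = fraction_remaining(176.0)
```

Under this assumption less than 1% of the ASO would remain after eight half-lives (176 days), in line with the fading effect on exon 7 inclusion at later time points.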

The SMN2-splicing changes were consistent with the ASO distri- 
bution, as assayed by immunohistochemistry, with the apparent excep- 
tion of the kidneys; however, in the kidneys, most of the ASO had not 
been internalized by the cells (Supplementary Figs 10 and 11). We also 
observed some of the ASO accumulating in spinal cord motor neurons 
(Supplementary Figs 10 and 11). The limited distribution of the ASO 
and the moderate SMN2-splicing changes in the CNS after systemic 
administration probably reflect incomplete closure of the blood-brain 
barrier in neonates’” and/or retrograde transport of the ASO. However, 
we detected strong cytoplasmic SMN staining and/or a pronounced 
increase in gem number in spinal cord motor neurons after ICV injection
of 20 µg ASO-10-27 but not after two SC injections of ASO-10-27
at 80 µg g⁻¹ between P0 and P3, a dosage that substantially rescued the
severe SMA mice (Supplementary Figs 12 and 13). Therefore, the effect 
on splicing in the CNS after systemic administration probably contri- 
butes to the extended survival, which is consistent with the combined 
ICV and SC treatment resulting in even better survival than SC admin- 
istration alone (Fig. 1d). However, the striking effects of systemic 
administration on survival in this severe mouse model cannot be 
explained solely by a direct effect on SMN2 splicing in the CNS. 

The rescue of severe SMA mice by systemic administration of, for
example, histone deacetylase inhibitors or adeno-associated virus
vectors, has been attributed to the ability of these agents to cross the
blood-brain barrier’®!®. However, our data indicate that SMN restoration
in peripheral tissues, in combination with partial restoration in the
CNS, can achieve efficient rescue of severe SMA mice.

Figure 2 | SMN2 splicing and protein expression in mouse tissues after SC
injection of ASO-10-27. a, Tissue RNA samples from SMA mice were
analysed by radioactive RT-PCR at P7 after two SC injections between P0 and
P3 with 0, 40, 80 or 160 µg g⁻¹ body weight ASO-10-27. b, Histogram of exon 7
inclusion data from panel a (n = 3). c, Protein samples from P7 SMA mice
(n = 3) that had been treated with 160 µg g⁻¹ body weight ASO-10-27 were
analysed by immunoblotting with a monoclonal antibody specific for human
SMN (see also Supplementary Fig. 8b). d, RT-PCR of liver RNA from P10, P30
and P180 SMA mice, showing the decreasing effect of ASO-10-27 over time.
e, Histogram of data from panel d. b, c, e, Data are presented as mean ± s.d. *,
P < 0.05; **, P < 0.01; ***, P < 0.0001; all compared with saline controls.

In mice that had been treated systemically with 160 µg g⁻¹ ASO-10-27
and sacrificed on P9, histological examination of tissues or organs
associated with SMA revealed striking improvements, consistent with
the markedly increased survival of mice in the SC160 group. The
α-motor neuron counts in the spinal cords of these mice were comparable
to those of the control heterozygous littermates, and the mean
area of muscle fibre cross-sections was >80% of that of heterozygotes 
(Fig. 3a, b and Supplementary Fig. 14a). Likewise, the heart weight and 
the thickness of the interventricular septum and the left ventricular 
wall were similar in mice in the SC160 group and their heterozygous 
littermates (Fig. 3c, d and Supplementary Fig. 14b). Finally, staining of 
the neuromuscular junctions (NMJs) showed that NMJ integrity was 
similar in mice in the SC160 group and their heterozygous littermates 
(Fig. 3e). 

Most of the mice that were treated systemically with the ASO
showed no overt signs of motor dysfunction (Supplementary Movie 2
and Supplementary Table 2). We used three tests to evaluate behaviour
and motor function. The first was a rotarod test, which requires limb
muscle strength, as well as balance and coordination. Three-month-old
mice in the SC80 and SC160 groups could stay on the rotating rod for
~12 s: that is, for less time than the control heterozygotes but for longer
than mice in the SC40 group. Some ASO-10-27-treated mice passed a
30-s acceleration-profile test that many of the heterozygotes failed
(Fig. 3f, g and Supplementary Movie 3). Considering that SMA is a
neuromuscular disease, this performance represents a remarkable
phenotypic improvement. The second test evaluated muscle strength
in mice from the SC160 group at 5 and 9 months. At both ages, the
forelimb grip strength of treated SMA mice was ~80% that of the control
heterozygous mice (Fig. 3h). The final test used HomeCageScan, a video-based
platform for automated high-resolution behaviour analysis’®.
ASO-10-27-treated SMA mice performed various behaviours similarly
to control heterozygous mice, except for rearing, suggesting some
hindlimb weakness (Supplementary Fig. 15).

Figure 3 | Evaluation of affected tissues and motor function. Tissues from
three groups of P9 mice were stained with haematoxylin and eosin: SMA mice
that had been treated with ASO-10-27 (SC160, two SC injections at 160 µg g⁻¹
body weight at P0–P3, n = 6), saline controls (SC0, n = 6) and untreated
heterozygotes (Het, n = 6) (see also Supplementary Fig. 14). Saline-treated
mice were ambulant at P9 and were expected to live for another 3–5 days.
α-Motor neuron counts in each cross-section of the L1–L2 spinal cord
(a), mean fibre cross-sectional area (for a total of 200 fibres) of the rectus
femoris muscle (b), heart weight (c) and thickness of the heart interventricular
septum (IVS) and left ventricular wall (LVW) (d) significantly improved in
ASO-10-27-treated mice. e, The arborization complexity of NMJs was restored
in ASO-10-27-treated mice (red, endplates; green, neurofilament medium).
Scale bar, 10 µm. f, g, P90 SC40 (n = 12, 124 trials), SC80 (n = 13, 137 trials),
SC160 (n = 11, 117 trials) and untreated heterozygous (Het, n = 12, 135 trials)
mice were tested three to five times per day for 3 days on a rotarod, using an
acceleration profile. The mean times for staying on the spinning rod (f) and the
percentage of no-fall trials and of mice with ≥1 no-fall trial (g) are shown.
h, The grip strength (gram-force) of SC160 mice (n = 6) evaluated at 5 and 9
months reached ~80% of that of heterozygous mice (n = 6). a–d, f–h, Data are
presented as mean ± s.d. *, P < 0.05; **, P < 0.01.

Two observations prompted us to examine the growth hormone 
(GH)-IGF1 axis. First, all severe SMA mice are small*’°””, reflecting 
growth retardation (Supplementary Figs 2a and 3c). Second, the major 
effect of SC injection of the ASO on SMN2 splicing is in the liver, which 
contributes ~75% of the circulating IGF1 (ref. 20). Moreover, restoring 
IGF1 expression in the liver is sufficient to support normal postnatal
growth of Igf1-null mice”. IGF1 is a potent neurotrophic factor”! and is 
also involved in cardiac development and function”. An enzyme- 
linked immunosorbent assay (ELISA) of serum samples from SMA 
mice at P6-P9 showed that IGF1 was undetectable or present at greatly 
reduced levels compared with the heterozygous controls; SC administration
of the ASO restored IGF1 to normal levels in SMA mice
(Fig. 4a). 

RT-PCR showed that the level of hepatic Igf1 messenger RNA was
not reduced in SMA mice compared with the heterozygous controls 
and that it increased from P1 to P5 in both SMA and control mice 
(Fig. 4b). IGF-binding protein, acid labile subunit (IGFALS), which is 
postnatally stimulated by GH, binds to IGF1 and IGF-binding protein 
3 (IGFBP3) to form a stable ternary complex, extending the half-life of 
IGF1 from 10 min to >12h”. The inactivation of Igfals results in low 
levels of circulating IGF1 and IGFBP3, as well as impaired postnatal 
growth”. RT-PCR revealed a marked reduction in the amount of Igfals 
mRNA in the liver of both P1 and P5 SMA mice compared with the 
heterozygous controls; moreover, administration of the ASO rescued 
Igfals expression (Fig. 4c, d). We conclude that the striking reduction 
in serum IGF1 levels in SMA mice is likely to be caused by decreased 
Igfals expression, which correlates with SMN deficiency and SMA 
progression. 

Because Igfals expression is decreased on P1, when the pups are still 
healthy, we propose that the early deficiency in circulating IGF1 may 
be one of the factors that contribute to the pathogenesis of severe SMA 
mice (Supplementary Fig. 1b). Although in several mouse mutants, an 
impaired GH-IGF1 axis results in an increased lifespan”, a severe lack 
of IGF1 may contribute to SMA progression, together with other 
defective factors. Consistent with our hypothesis, two recent studies 
have shown that a local increase in IGF1 in either the spinal cord or 
muscle increases the survival of severe SMA mice*”’. Indeed, disrup- 
tion of the IGF1 system is a common feature of neurodegenerative 
diseases, including Alzheimer’s disease and amyotrophic lateral scler- 
osis (ALS)”’. Igf1-null mice also show some phenotypic similarity to 
SMA mice, such as small size and severe and generalized muscle dys- 
trophy (including of the diaphragm and heart), with most of them 
dying at birth*®. Moreover, dysregulation of the IGF1 receptor and its 
downstream signalling pathway has been observed in patients with 
type I SMA”. However, the results of IGF1 therapy for ALS are not 
consistent between mice and humans*'””. In light of this inconsistency,
it will be crucial to determine the extent to which SMA mouse models
accurately mimic human SMA. This will further refine our understanding
of the mouse models and influence the development of therapeutics
and clinical treatments for SMA.

Figure 4 | The IGF1 system is disrupted in SMA mice. Treated SMA mice
(SC80) received two SC injections of ASO-10-27 at 80 µg g⁻¹ body weight
between P0 and P3. a, IGF1 serum levels in P6 and P9 SMA mice (SC0), as
measured by ELISA (mean of three measurements per sample), were strikingly
lower than for their heterozygous littermates (Het) or treated SMA mice
(P < 0.001 for all samples). b, Total liver RNA from P1 and P5 SMA mice and
their heterozygous littermates was analysed by radioactive RT-PCR to measure
Igf1, Igfals and Igfbp3 expression, with Gapdh as a control. c, ASO-10-27
treatment restored hepatic Igfals expression at P5, as shown by radioactive RT-PCR.
d, Quantification of hepatic Igfals expression (n = 5). *, P < 0.01 versus
samples from heterozygous littermates or ASO-10-27-treated SMA mice.
a, d, Data are presented as mean ± s.d.


METHODS SUMMARY 


ASO-10-27 was synthesized as described previously" and dissolved in 0.9% saline.
The severe SMA mouse model was generated from a type III SMA mouse model, as 
described previously””°. All mouse protocols were in accordance with the Cold 
Spring Harbor Laboratory’s Institutional Animal Care and Use Committee guide- 
lines. Treated mice (control and ASO-10-27) were provided with additional gel 
food. The procedures for neonatal ICV injection, tissue sample collection, RT- 
PCR, western blotting and the human-specific anti-SMN antibody (SMN-KH) 
were as described previously''. The primers for the gene expression analysis are 
shown in Supplementary Table 3. Serum IGF1 levels were analysed with a Mouse/ 
Rat IGF-I Quantikine ELISA kit (R&D Systems). 

Mouse spinal cords, quadriceps and hearts were fixed and stained with haematoxylin
and eosin as described previously’*. α-Motor neurons were counted in
serial 10–20-µm cross-sections of the lumbar (L1–L2) spinal cord. The muscle
fibre cross-sectional area was calculated by using AxioVision LE velocity software. 
NMJ staining in toto was performed as described previously”’. 

For the rotarod (AccuScan Instruments) test, a four-phase profile was used: 
phase 1, from 1 to 10 r.p.m. in 7.5 s; phase 2, from 10 to 0 r.p.m. in 7.5 s; phase 3, 
from 0 to 10 r.p.m. in 7.5s in the opposite direction; and phase 4, from 10 to 0 
r.p.m. in 7.5s. A grip-strength meter (Columbus Instruments) was used for the 
gripping test. Mice were allowed to grasp a triangular bar with their forelimbs and 
were pulled back horizontally. The test was repeated five times for each mouse, and 
the highest value was recorded as the grip force for that animal. 
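
The four-phase rotarod profile and the grip-strength scoring rule described above can be written down compactly. The sketch below is an illustration only: it assumes that the speed changes linearly within each 7.5-s phase (the text gives only the start and end speeds) and represents rotation in the opposite direction with a negative sign. The function names and the grip readings are hypothetical.

```python
PHASE_SECONDS = 7.5
# (start r.p.m., end r.p.m.) for each phase; negative values mean the opposite direction.
PHASES = [(1.0, 10.0), (10.0, 0.0), (0.0, -10.0), (-10.0, 0.0)]
TOTAL_SECONDS = PHASE_SECONDS * len(PHASES)  # 30 s in total


def rotarod_rpm(t: float) -> float:
    """Rod speed at time t (seconds), assuming linear ramps within each phase."""
    if not 0.0 <= t <= TOTAL_SECONDS:
        raise ValueError("t must lie within the 30-s profile")
    index = min(int(t // PHASE_SECONDS), len(PHASES) - 1)
    start, end = PHASES[index]
    fraction = (t - index * PHASE_SECONDS) / PHASE_SECONDS
    return start + (end - start) * fraction


def grip_force(trials):
    """Grip force for one animal: the highest value across its repeated trials."""
    return max(trials)


# Hypothetical readings (gram-force) from five pulls for one mouse.
best_pull = grip_force([92.0, 101.5, 98.0, 97.2, 100.1])
```

The four phases together last 30 s, which corresponds to the 30-s acceleration-profile test referred to in the main text.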

Statistical significance was analysed by two-tailed Student’s t-tests. Kaplan- 
Meier survival data were analysed with Mantel-Cox tests using the program 
GraphPad Prism. 
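
For readers unfamiliar with Kaplan-Meier estimates, the following pure-Python sketch shows how a survival curve and a median survival time can be derived from per-animal data. It is not the GraphPad Prism procedure used in the study and does not implement the Mantel-Cox test; the convention of taking the median as the first time at which survival falls to 50% or below is an assumption, and the survival times are hypothetical.

```python
def kaplan_meier(times, died):
    """Return [(time, survival_probability)] at each time with at least one death.

    `died[i]` is True for an observed death and False for a censored animal.
    """
    observations = sorted(zip(times, died))
    at_risk = len(observations)
    survival = 1.0
    curve = []
    index = 0
    while index < len(observations):
        t = observations[index][0]
        tied = [d for time, d in observations if time == t]
        deaths = sum(tied)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(tied)  # deaths and censored animals both leave the risk set
        index += len(tied)
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None


# Hypothetical survival times in days; every death is observed.
saline_curve = kaplan_meier([8, 9, 10, 10, 11, 12], [True] * 6)
saline_median = median_survival(saline_curve)
```

Animals still alive at the end of observation, such as the long-term survivors in the high-dose groups, would enter as censored observations (`died` set to False).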

Pathogenic exon-trapping by SVA retrotransposon and rescue in Fukuyama muscular dystrophy

Fukuyama muscular dystrophy (FCMD; MIM253800), one of the
most common autosomal recessive disorders in Japan, was the first 
human disease found to result from ancestral insertion of a SINE- 
VNTR-Alu (SVA) retrotransposon into a causative gene’*. In 
FCMD, the SVA insertion occurs in the 3’ untranslated region 
(UTR) of the fukutin gene. The pathogenic mechanism for 
FCMD is unknown, and no effective clinical treatments exist. 
Here we show that aberrant messenger RNA (mRNA) splicing, 
induced by SVA exon-trapping, underlies the molecular pathogen- 
esis of FCMD. Quantitative mRNA analysis pinpointed a region 
that was missing from transcripts in patients with FCMD. This 
region spans part of the 3’ end of the fukutin coding region, a 
proximal part of the 3’ UTR and the SVA insertion. Corres- 
pondingly, fukutin mRNA transcripts in patients with FCMD 
and SVA knock-in model mice were shorter than the expected 
length. Sequence analysis revealed an abnormal splicing event, 
provoked by a strong acceptor site in SVA and a rare alternative 
donor site in fukutin exon 10. The resulting product truncates the 
fukutin carboxy (C) terminus and adds 129 amino acids encoded 
by the SVA. Introduction of antisense oligonucleotides (AONs) 
targeting the splice acceptor, the predicted exonic splicing enhancer 
and the intronic splicing enhancer prevented pathogenic exon- 
trapping by SVA in cells of patients with FCMD and model mice, 
rescuing normal fukutin mRNA expression and protein produc- 
tion. AON treatment also restored fukutin functions, including 
O-glycosylation of α-dystroglycan (α-DG) and laminin binding by
α-DG. Moreover, we observe exon-trapping in other SVA insertions
associated with disease (hypercholesterolemia‘, neutral lipid
storage disease’) and human-specific SVA insertion in a novel gene.
Thus, although splicing into SVA is known* *, we have discovered in 
human disease a role for SVA-mediated exon-trapping and demon- 
strated the promise of splicing modulation therapy as the first 
radical clinical treatment for FCMD and other SVA-mediated 
diseases. 

FCMD (incidence 1/34,000 births) shares phenotypic similarities 
with other severe muscular dystrophies, including muscle-eye-brain 
disease and Walker-Warburg syndrome. All show deficiencies in 
O-glycosylation of α-DG, an extracellular protein anchored on the
plasma membrane. Insufficient O-glycosylation interferes with the
ability of α-DG to interact with extracellular matrix proteins such as
laminin®”°. For this reason, FCMD, muscle-eye-brain disease and
Walker-Warburg syndrome are categorized as α-dystroglycanopathies
(α-DGpathy; ref. 10). So far, no effective treatments exist for these
conditions. SVA is a hominid-specific, composite non-coding retrotransposon
that contains SINE (short interspersed sequence), VNTR
(variable number of tandem repeat), and Alu sequences. It is still active
in humans, polymorphic and mobilized by the human LINE-1 in
trans*'.

In previous work, we showed that fukutin mRNA (10 exons, 7.4- 
and 6.4-kilobase (kb) cDNAs in size with two poly-A sites, 461-amino- 
acid protein with calculated molecular mass of 53.7 kDa) was not 
detectable by northern blot analysis in patients with FCMD carrying 
the SVA insertion’. To investigate the aetiology of this decreased 
expression, we have now analysed whole fukutin mRNA in lympho- 
blasts from patients with FCMD using quantitative PCR with reverse 
transcription (qRT-PCR). PCR products corresponding to the 
protein-coding region of fukutin, as well as those including sequences 
in the distal part of the 3’ UTR (and thus downstream of the SVA 
insertion), were similar in abundance to those from an unaffected 
control (Fig. 1a). However, products located at sequence positions
within the 3’ UTR were markedly decreased relative to the control. 
From these results and along with previous reports of many 3’ and 5’ 
splice sites within SVA elements*®*, we hypothesized that abnormal 
splicing occurs somewhere between the end of the fukutin protein- 
coding region and the SVA insertion. 

We then performed long-range RT-PCR using primers that flank 
the region corresponding to decreased expression. In patients with 
FCMD, we detected a single 3-kb PCR product, which is shorter than 
the 5-kb product seen in the normal control (Fig. 1b). This observation 
was consistent in several tissue types from patients with FCMD 
(Supplementary Fig. 1). PCR from genomic DNA produced an 8-kb 
product in patients with FCMD, compared with a 5-kb product in the 
control (Fig. 1b). Sequence analysis of the 3-kb product from FCMD 
cDNA revealed a splicing event (Supplementary Fig. 2). This event 
generates a new donor-side breakpoint within the final coding exon 
(exon 10), located 116 base pairs (bp) upstream from the authentic 
stop codon. A rare alternative donor site at that position is activated 
and trapped by an alternative acceptor site located within the inserted 
SVA, creating an additional and aberrant exonic sequence (exon 11) 
(Fig. 1c). The acceptor-side breakpoint is located 274 bp downstream 
from the start of the SVA insertion, between ag and TC (Fig. 1c). The 
acceptor site has not been described in the previous reports of SVA 
splicing®’. This location is preceded by a pyrimidine-rich stretch, the 
SVA (TCTCCC) 4; hexamer at the 5’ end of the SVA element, with a 
possible favourable branch point. Predicted exonic splicing enhancer 
sites occur around 70 bp downstream from the new acceptor site. We 
confirmed that the aberrant splicing event can be abolished by repla- 
cing AG with GG at the acceptor junction in cultured cells transfected 
with a fukutin construct carrying the SVA insertion (Supplementary Fig. 3). 
Fukutin expression was not altered by cycloheximide treatment, indi- 
cating that the transcript was not subject to nonsense-mediated mRNA 
decay, possibly because this exon-trapping occurred within the last 

Figure 1 | An SVA retrotransposal insertion induces abnormal splicing in 
FCMD. a, Expression analysis of various regions of fukutin mRNA in 
lymphoblasts. Grey bar, the ratio of RT-PCR product in patients with FCMD 
relative to the normal control; numbers on the x axis, nucleotide positions of 
both forward and reverse primers in fukutin. Error bars, s.e.m. b, Long-range 
PCR using primers flanking the expression-decreasing area (nucleotide 
position 1,061-5,941) detected a 3-kb PCR product in FCMD lymphoblast 
cDNA (open arrow) and an 8-kb product in FCMD genomic DNA (filled 
arrow). In the normal control, cDNA and genomic DNA both showed 5-kb 
PCR products. The 8-kb band was weak, probably because the VNTR region of 


exon, and the new stop codon exists downstream of the new last exon- 
exon junction (Supplementary Fig. 4). 

We have recently generated knock-in mice that carry a humanized 
fukutin exon 10, which either includes (Hp allele) or excludes (Hn 
allele) the SVA insertion, and bred these strains with heterozygous 
fukutin knockout mice to obtain compound heterozygotes (Hp/—)"*. 
Knock-in mice that are homozygous (Hp/Hp) and compound hetero- 
zygous (Hp/—) are representative of the human FCMD alleles. These 
mice exhibit hypoglycosylation of α-DG in skeletal muscle, which is 
the most significant characteristic in α-DGpathy'®. Quantitative RT- 
PCR in various tissues from Hp/Hp mice revealed an aberrant splicing 
pattern identical to that seen in human patients (Supplementary 
Fig. 5). Northern blot analysis detected abnormally spliced fukutin 
mRNA species at the expected sizes of 5.6 and 4.6 kb in patients with 
FCMD, whereas the normal fukutin mRNAs appeared at 7.4 and 6.4 kb 
(Fig. 1d and Methods). We replicated these results in the knock-in 
model mice (Fig. 1e and Supplementary Fig. 6a). The consistent observations 
between patients with FCMD and knock-in model mice lead us 
to conclude that a splicing abnormality underlies the pathogenesis of 
FCMD. 
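The band sizes quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; it assumes that the larger normal band pairs with the larger FCMD band, and the function name `size_shifts` is introduced here for illustration only.

```python
# Transcript sizes (kb) as reported for the northern blot analysis.
NORMAL_KB = (7.4, 6.4)   # normal fukutin mRNA species
FCMD_KB = (5.6, 4.6)     # abnormally spliced species in patients with FCMD


def size_shifts(normal, mutant):
    """Return the size reduction (kb) for each normal/mutant band pair.

    Assumption: bands are paired in order of size (largest with largest).
    """
    return [round(n - m, 1) for n, m in zip(normal, mutant)]


shifts = size_shifts(NORMAL_KB, FCMD_KB)
print(shifts)
```

Under that pairing assumption, both species are shortened by the same amount, which is what one would expect if a single splicing event affects both transcripts.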

Abnormal splicing excises the authentic stop codon and produces 
another stop codon located 388 bp downstream from the 5’ side of the 
new exon 11 (Fig. 1c). The predicted protein lacks the C-terminal 38 
amino acids of fukutin, instead containing 129 amino acids derived 
from the SVA sequence (Supplementary Fig. 7). Endogenous fukutin is 
scarce and difficult to detect; however, we were able to identify both 
SVA is GC-rich (82%). c, Representation of genomic DNA and cDNA in 
FCMD. Black and white arrows, forward and reverse sequencing primers. The 
intronic sequence in FCMD is indicated in lower case. The authentic stop 
codon is coloured red, and the new stop codon is coloured blue. d, e, Northern 
blot analysis of fukutin in human lymphoblasts (d) and model mice (e); F, 
FCMD; N, normal control. The wild-type mouse fukutin mRNA was detected 
at a size of 6.1 kb. Both skeletal muscle (left) and brain (right) showed smaller, 
abnormal bands in Hp/Hp mice. WT, wild type; Hn, Hn/Hn mice; Hp, Hp/Hp 
mice. f, Representation of genomic DNA and cDNA in ARH (LDLRAP1, left), 
NLSDM (PNPLA2, middle) and human (AB627340, right). 


normal and aberrant forms of the protein in human and mouse using 
immunoprecipitation followed by western blot analysis. The abnormal 
fukutin protein in FCMD displayed the predicted mobility shift 
(Fig. 2a-c and Supplementary Fig. 6b). 

We introduced normal and aberrantly spliced fukutin cDNA con- 
structs into mammalian cell lines. Whereas normal fukutin localized to 
the Golgi apparatus, the aberrantly spliced fukutin protein is displaced 
completely from the Golgi to the endoplasmic reticulum (Fig. 2d and 
Supplementary Fig. 8). Further examination showed that a fukutin 
construct lacking the C-terminal 38 amino acids also mislocalized to 
the endoplasmic reticulum (Fig. 2d and Supplementary Fig. 8), sug- 
gesting that the C-terminal domain of fukutin is important for local- 
ization to the Golgi. Thus, impairment of this domain may lead to 
fukutin dysfunction in FCMD. The mislocalization is unlikely to be 
toxic because FCMD is an autosomal recessive disease and heterozyg- 
ous carriers of the SVA insertion have no symptoms. 

We next tested if exon-trapping occurs in other diseases with SVA 
insertion’. In a patient with autosomal recessive hypercholesterolemia 
(ARH), a 2.6-kb SVA was inserted within intron 1 of the LDLRAP1 
gene’. A patient with lipid storage disease with subclinical myopathy 
(NLSDM) also had a 1.9-kb SVA insertion in exon 3 of the PNPLA2 
gene®. We found abnormally spliced products induced by SVA exon- 
trapping in fibroblasts from these patients (Fig. 1f left and middle panels, 
Supplementary Figs 9 and 10, and Supplementary Table 1). Cycloheximide 
treatment of fibroblasts from these patients increased 
expression of the genes (Supplementary Figs 9a and 10a), suggesting 

Figure 2 | Abnormal fukutin protein in FCMD. a-c, Immunoprecipitation 
analysis of fukutin protein in human lymphoblasts (a), both skeletal muscle and 
brain tissues from Hp/Hp mice (b) and brain tissue from patients with FCMD 
(c); filled arrow, abnormal fukutin; N, normal sample; F, sample from patient 
with FCMD; Hn, Hn/Hn mice; Hp, Hp/Hp mice; PI, pre-immune serum; D, 
patient with Duchenne muscular dystrophy. d, The subcellular localization of 
fukutin. Top, normal fukutin; middle, mis-spliced fukutin; bottom, truncated 
fukutin. Stained with anti-FLAG (left, to detect fukutin), anti-GM130 (middle, 
Golgi marker, top) and anti-KDEL (endoplasmic reticulum marker, middle 
and bottom), and merge (right, with DAPI stain). Scale bar, 10 µm. 


that the SVA-trapped transcripts are likely to be subjected to non- 
sense-mediated mRNA decay®””. In a search for the same events using 
the same acceptor site as FCMD in the human genome, we located two 
expressed sequence tags on human chromosome 4 (DA436529 and 
DA060755) that represent a spliced transcript induced by an SVA 
element. We found exonization in a human-specific insertion of 
SVA (AB627340) into a small gene (Fig. 1f right panel and Sup- 
plementary Fig. 11). The human-specific exon-trapping of SVA in 
the small gene might influence human evolution and development. 

FCMD alleles of the fukutin gene contain a fully intact protein 
coding sequence, raising the possibility that FCMD could be treated 
by restoring translation of the full-length protein through splicing 
modulation with AONs. To identify promising target sequences in 
various cell lines, we produced 25-mer 2'-O-methyl phosphoramidite 
(2’OMePS) AONs targeted to the acceptor (A1-A3), donor (D1-D5) 
and exonic splicing enhancer sites (E1-E4) in fukutin pre-mRNA 
(Supplementary Fig. 12). We introduced the AONs into various cell 
types and assessed the recovery of normal processing and restoration 
of the authentic stop codon (Fig. 3a). Cells with A3 and E3 showed 
strong suppression of SVA-derived splicing. The greatest recovery of 
fukutin mRNA, to levels of more than 40% of the normal control, was 
achieved with a combination of A3, E3 and D5 (AED) (Fig. 3a). The D5 
sequence overlaps with a predicted intronic splicing enhancer site 
within the aberrant intronic sequence; in normal fukutin, this 
sequence resides in exon 10 (Supplementary Fig. 12). 

Figure 3 | AON cocktail rescues normal fukutin mRNA. a, RT-PCR 
diagram of three primers designed to assess normal fukutin mRNA recovery 
(upper). Black arrow, a common forward primer located on fukutin coding 
region; dark grey arrow, a reverse primer to detect the abnormal RT-PCR 
product (161 bp); light grey arrow, the other reverse primer to detect the 
restored normal RT-PCR product (129 bp). The effect on Hp/Hp ES cells 
treated with each single or a cocktail of AONs (lower). F, FCMD; N, normal 
sample. b, Rescue from abnormal splicing in VMO-treated Hp/Hp and Hp/— 
mice. Local injection of AED cocktail into tibialis anterior (n = 3). Dys, a 
negative control. c, Rescue from abnormal splicing in VMO-treated human 
FCMD lymphoblasts (left, n = 2) and myotubes (right, n = 2). The y axis shows 
the percentage recovery of normal mRNA (*P < 0.01 by Student’s t-test). TA, 
tibialis anterior. Error bars, s.e.m. 


We injected octa-guanidine morpholino oligonucleotide (vivo- 
morpholino, VMO)'* AED cocktail locally into skeletal muscle of 
knock-in mice and evaluated the therapeutic effect by calculating the 
percentage recovery of normally processed mRNA. In the AED-treated 
tibialis anterior and gastrocnemius of Hp/Hp and Hp/— mice, the 
amount of corrected fukutin mRNA increased significantly relative to 
mice treated with control VMO (Fig. 3b and Supplementary Fig. 13). We 
assessed fukutin protein recovery in injected skeletal muscle tissue from 
Hp/Hp mice. Consistent with the significant increase of restored normal 
mRNA, normal fukutin protein was rescued (Fig. 4a). We examined 
α-DG glycosylation in AED-treated Hp/— mice. Deficiently glycosylated 
α-DG, at the predicted smaller size, was reduced in abundance, 
whereas normal-sized α-DG increased after AED treatment (Fig. 4b). 
The signal intensity for glycosylated α-DG was clearly increased, and a 
shift in the α-DG core was observed, indicating that the rescued 
fukutin is functional. Laminin overlay assays revealed a marked 
increase in α-DG laminin-binding ability, indicating that α-DG 

Figure 4 | AON cocktail treatment rescues normal fukutin protein and 
functional α-DG. a, d, Immunoprecipitation analysis of fukutin protein after 
local treatment with VMO (AED) in FCMD model mice (a) and human FCMD 
lymphoblasts (d). Arrow, normal fukutin protein. L, left tibialis anterior; R, 
right tibialis anterior; Dys, negative control. b, c, e, Tibialis anterior muscle after 
local (b) or systemic (c) treatment with AED and human FCMD lymphoblasts 
treated with the AED (e) were analysed by western blot using antibodies against 
α-DG core protein (top panel) and glycosylated α-DG (second), and by a 
laminin overlay assay (third). Bottom, β-DG (internal control). f, Laminin 
clustering assay. Left, anti-laminin; middle, anti-glycosylated α-DG; right, 
merged images. Upper, normal myotubes treated with control VMO; middle, 
FCMD patient myotubes treated with control VMO; bottom, FCMD patient 
myotubes treated with AED. 


function also is recovered (Fig. 4b). We next tested systemic AED 
treatment by intravenous injection of Hp/— mice. This treatment also 
showed the recovery of normally glycosylated α-DG in AED-treated 
mice (Fig. 4c). 

We administered the VMO AED cocktail to human lymphoblasts 
and myotubes. As in knock-in mice, we observed successful correction 
of the splicing abnormality. The corrected fukutin mRNA was restored 
to 50% or more of the levels seen in normal controls (Fig. 3c). We 
believe this to be sufficient recovery, considering that unaffected 
FCMD carriers have only 50% of normal fukutin mRNA. Finally, we 
tested recovery of the fukutin protein and the glycosylation of α-DG in 
the cells of patients with FCMD. Not only was normal fukutin protein 
expression significantly rescued in AED-treated lymphoblasts 
(Fig. 4d), but also we observed recovery of normally glycosylated 
α-DG in AED-treated myotubes (Fig. 4e). Immunofluorescence staining 
also showed greatly increased glycosylated α-DG (Fig. 4f). A 
laminin clustering assay showed increased laminin clustering ability, 
which is characteristically absent in α-DGpathy”” (Fig. 4f). These data 
show that AED treatment effectively rescues normal fukutin, confirm- 
ing our observation of abnormal fukutin splicing and raising the pos- 
sibility of splicing modulation therapy as the first treatment for FCMD. 
To treat the neuronal migration disorder of FCMD, prenatal treatment 
may be necessary, but this is currently difficult for ethical and technical 
reasons. Nevertheless, improving the muscular symptoms alone 
would greatly improve the quality of life of patients and their 
families. 

Retrotransposons account for nearly half of the human genome”. 
An increasing number of reports have highlighted positive and negative 
contributions of retrotransposons to human health and disease”!”*. In 
addition to being the causative factor for FCMD, ARH and NLSDM, 
SVA insertions have also been implicated in hereditary elliptocytosis, 
X-linked agammaglobulinemia, neurofibromatosis type 2 and X-linked 
dystonia-Parkinsonism’****. It has been suggested that SVA inser- 
tions cause such diseases through genomic deletion, reduced mRNA 
expression or skipping of neighbouring exons'””’. Recently, SVA splic- 
ing has been suggested to generate variation within and across species 
by activating functional 3’ splice sites within SVAs across the human 
genome, controlling gene transcription, creating alternative splicing by 
exon-trapping, or inducing premature stop codons, and was experi- 
mentally demonstrated®. Our findings emphasize the importance of 
SVA functions in human disease and support the possibility of radical 
treatment against SVA-induced disease by splicing modulation ther- 
apy. AONs have become one of the most promising and practical 
candidate chemicals for splicing modulation therapy in cancer”, infec- 
tious diseases** and Duchenne muscular dystrophy”*”’. In demonstrat- 
ing the ability of AONs to rescue fukutin function in FCMD, we 
introduce a novel clinical role for them in treating FCMD and other 
SVA-mediated diseases, while providing new insights about the influ- 
ence of SVAs on human evolution, development and disease. 


METHODS SUMMARY 


For AON treatment, 25-mer 2’OMePS (GeneDesign and Invitrogen) and octa- 
guanidine morpholino (VMO; Gene-Tools) were used. The knock-in mouse was 
produced as described previously’®. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Antisense oligonucleotides. Twenty-five-mer 2’OMePS (GeneDesign and 
Invitrogen) and VMO oligonucleotides (Gene-Tools) were designed to target 
potential splice-modulating sequences of SVA-inserted fukutin, including a splicing 
acceptor site, a splicing donor site, exonic splicing enhancers and intronic 
splicing enhancers as follows: A1, CCGTGGAAGGAGACTGTGGAGGGAG; 
A2, GGAGACCGTGGAAGGAGACTGTGGA; A3, AGAGGGAGACCGTGGAAGGAGACTG; 
E1, CACCGTCCAGCCTTGGCTCGGCATC; E2, CTGCAGTGAGCCGAGATGGCAGCAG; 
E3, GAGGCAGGAGAATCAGGCAGGGAGG; E4, GAAAACCAGTGAGGCGTAGCAGGCT; 
D1, CAGGTCTTACCATAGTTGGCTTCAA; D2, CAGGAATCTTCCAGGTCTTACCATA; 
D3, GAGCGCTTCCAGTCCCACGTCTTTA; D4, TCCATTGGGTTGCACATTGGGAGGA; 
D5, CATCCCACTCAGAAATAGGCCAGAT; DYS, GGCCAAACCTCGGCTTACCTGAAAT*:. 
U (uracil) was used instead of T (thymine) for the synthesis of 
2’OMePS oligonucleotides. Target sequences are shown in Supplementary Fig. 12. 
Exonic splicing enhancer sites were predicted by ESEfinder 3.0 
(http://rulai.cshl.edu/cgi-bin/tools/ESE3/esefinder.cgi), and intronic splicing enhancer sites were predicted 
by ACESCAN2 (http://genes.mit.edu/acescan2/index.html). AONs were solubilized 
in sterile distilled water. 
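
The oligonucleotide design described above can be illustrated with a short sketch. It is not the authors' design procedure: it only takes the three AED cocktail sequences as listed, checks that each is a 25-mer, converts T to U as stated for the 2’OMePS chemistry, and derives the sense-strand sequence that each antisense oligonucleotide would pair with. The helper names `to_rna_alphabet` and `reverse_complement` are introduced here for illustration.

```python
# AED cocktail sequences (5' to 3'), copied from the list in the Methods text.
AED = {
    "A3": "AGAGGGAGACCGTGGAAGGAGACTG",
    "E3": "GAGGCAGGAGAATCAGGCAGGGAGG",
    "D5": "CATCCCACTCAGAAATAGGCCAGAT",
}

# Translation table for DNA base complementation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna_alphabet(seq):
    """Write a DNA-alphabet sequence with U in place of T."""
    return seq.replace("T", "U")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in AED.items():
    target = reverse_complement(seq)  # sense-strand sequence the AON pairs with
    print(name, len(seq), to_rna_alphabet(seq), target)
```

For A3, the derived target begins with CAG followed by TC, which agrees with the acceptor junction described in the main text as lying between ag and TC.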

Animals and cells. All mouse experimental protocols were approved by the Ethics 
Review Committees for Animal Experimentation at Osaka University Graduate 
School of Medicine and Kobe University Graduate School of Medicine. FCMD 
knock-in model mice and the mouse nomenclature have been described previ- 
ously’*®. The transgenic alleles containing normal and SVA-inserted human exon 
10 were named Hn and Hp, respectively: Hp/Hp is homozygous for the SVA allele; 
Hn/Hn is homozygous for the normal allele; Hp/+ and Hp/— are SVA carriers 
and compound heterozygotes for the SVA and knockout alleles, respectively. The 
ages of mice used in experiments varied from 2 to 6 months. The mouse ES cell line 
carrying the SVA-inserted human genomic fukutin exon 10 was generated from 
Hp/Hp mice. The ES cell line carrying a fukutin knockout allele has been described 
previously**. The commercially available mouse ES cell line AB2.2 was used as a 
control. Human lymphoblasts were obtained from patients with FCMD with 
homozygous SVA insertions and from unaffected individuals. Human primary 
myoblasts were derived from muscle biopsies from patients with FCMD and 
unaffected individuals. Human primary fibroblasts were obtained by skin biopsy 
from patients with ARH and NLSDM. Human autopsy brain samples were 
obtained from patients with FCMD (fetus and 34-year-old) and DMD (34-year- 
old). The chimpanzee brain sample was provided by the Great Ape Information 
Network, Japan. Human brain RNA was purchased from Clontech. All clinical 
samples were used with the approval of Human Ethics Review Committees of 
Osaka University Graduate School of Medicine and Kobe University Graduate 
School of Medicine. 

Myoblast differentiation. Myoblast cells were maintained at 37 °C and 5% CO₂ in 
DMEM medium plus 20% fetal bovine serum, 2.5 ng ml⁻¹ of basic fibroblast 
growth factor (Sigma), and a 0.5% penicillin-streptomycin-amphotericin B mix 
(Wako). Myotubes were obtained from confluent myoblast cultures after 10-14 
days of serum deprivation and replacement with 2% FBS. 

RNA isolation, RT-PCR, qRT-PCR and sequencing. To inhibit nonsense- 
mediated mRNA decay, cycloheximide (100 µg ml⁻¹) (Sigma) was added to the 
culture medium 24 h before RNA isolation. For RT-PCR and qRT-PCR, total 
RNA was extracted using the RNeasy Plus Mini kit (Qiagen), and cDNA was 
obtained using the Superscript III One-step RT-PCR system (Invitrogen) with 
random primers, following the manufacturer’s instructions. SYBR Pre-mix Ex Taq 
(Takara) was used for qRT-PCR, and expression values were normalized to gapdh 
as an internal control for mRNA quantity. Data were obtained from triplicate 
experiments. To detect abnormally spliced RT-PCR products from patients with 
FCMD, ARH and NLSDM, and from human brain AB627340 cDNA, long-range 
PCR was performed using LA Taq with LA Taq Buffer II (Takara), adding 
dimethyl sulphoxide and 7-deaza-dGTP (Roche). The RT-PCR products were 
directly sequenced (FCMD and NLSDM), or cloned with the TOPO TA 
Cloning Kit (Invitrogen) before sequencing (ARH and AB627340). To calculate 
the expression ratio in Fig. 1a and Supplementary Figs 4, 5, 9, 10 and 13, the value 
in the mutant sample was divided by the value in the normal sample, as measured 
by qRT-PCR. To identify AON target sequences, we designed three primers to 
distinguish recovered transcripts from unrecovered transcripts by AON treatment 
(Fig. 3a). Similarly, we designed three primers to compare the expression levels of 
SVA-trapped and SVA-untrapped transcripts of the AB627340 gene (Supplementary 
Fig. 11a). One primer on SVA in Fig. 3a and Supplementary Fig. 11a was 
within the Alu-like domain: the sequence was 5’-GAAAACCAGTGAGGCGTAGC-3’. 
To calculate the percentage recovery of normal mRNA processing in Fig. 3b, c and 
Supplementary Fig. 13, the value of treated sample was divided by that of normal 
samples, as measured by RT-PCR at sequence position 1341, where the authentic 


stop codon resides. Primer sequences for qRT-PCR and RT-PCR are available 
upon request. 
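
The two quantities defined above, the expression ratio and the percentage recovery, are simple quotients. The sketch below shows one way to compute them from signals normalized to gapdh. All numeric inputs are hypothetical and the function names are introduced here for illustration; the text does not state how raw PCR signals were converted to expression values, so that step is left outside the sketch.

```python
def normalize(target_signal, gapdh_signal):
    """Normalize a target expression value to the gapdh internal control."""
    return target_signal / gapdh_signal


def expression_ratio(mutant_value, normal_value):
    """Value in the mutant sample divided by the value in the normal sample."""
    return mutant_value / normal_value


def percent_recovery(treated_value, normal_value):
    """Treated-sample value as a percentage of the normal-sample value."""
    return 100.0 * treated_value / normal_value


# Hypothetical signals (arbitrary units), for illustration only.
normal = normalize(8.0, 4.0)    # untreated normal control
fcmd = normalize(1.0, 5.0)      # untreated FCMD sample
treated = normalize(4.0, 4.0)   # AON-treated FCMD sample

print(expression_ratio(fcmd, normal))
print(percent_recovery(treated, normal))
```

With these made-up inputs, the treated sample reaches half of the normal value, the level that the main text compares with unaffected carriers.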

Northern blot analysis. Previous attempts to detect fukutin mRNA in patients 
with FCMD by northern blot analysis have been unsuccessful’, probably because 
the predicted mRNA sequence is the same size as abundant ribosomal RNA. 
Moreover, the tertiary structure of fukutin mRNA is presumably complicated 
owing to the immensely GC-rich SVA sequence. Therefore, we performed northern 
blot analysis of FCMD and control mRNA after treatment to remove abundant 
ribosomal RNA and strong denaturation to untangle the fukutin transcript. Total 
RNA (1 mg) was extracted from human lymphoblasts, mouse ES cells, mouse brain 
and mouse skeletal muscle using TRIzol (Invitrogen). Oligotex-dT30<Super> 
(Takara) was used to extract more than 3 µg of poly-A RNA. Ribosomal RNA 
was removed using Ribo-Minus (Invitrogen). Stronger denaturation of RNA was 
achieved by incubating polyA-RNA samples with a combination of 0.8 M glyoxal 
and 50% DMSO in 10 mM sodium phosphate buffer (pH 7.0) for 60 min at 55 °C. 
Three micrograms of poly-A RNA was loaded on the agarose gel. A fukutin cDNA 
clone covering the fukutin coding sequence was ³²P-labelled and used as a probe. 

cDNA expression constructs. The normal fukutin cDNA encodes full-length 
fukutin protein. The spliced fukutin construct encodes abnormal fukutin, as 
shown in Supplementary Fig. 7. The truncated fukutin construct lacks the 
C-terminal 38 amino acids. All constructs encoded FLAG epitope tags fused to 
the C terminus of the expressed protein. 

Cell transfection. HeLa S3 cells and C2C12 cells were transfected with normal 
fukutin construct, spliced fukutin construct and truncated fukutin construct using 
FuGENE 6 (Roche). Fukutin localization was determined using immunocyto- 
chemistry 2 days after transfection. For transfection of AONs, 2'OMePS were 
introduced into various cell lines, including mouse ES cells, human myoblasts 
and human lymphoblasts, using Lipofectin (Invitrogen). 

Detection of endogenous fukutin protein. The polyclonal rabbit anti-fukutin 
antibody RY213 recognizes the peptide CLKIESKDPRLDGIDS, and the polyclo- 
nal goat-anti-fukutin antibody 106G2 recognizes full-length fukutin protein lack- 
ing the amino (N)-terminal hydrophobic domain. Endogenous fukutin was 
detected by immunoprecipitation using 106G2, from cell or tissue lysates contain- 
ing 5-10 mg of total protein in lysis buffer (1% Nonidet P-40, 0.5% deoxycholate, 
0.1% SDS, 20 mM Tris-Cl, pH 7.5 and 150 mM NaCl), followed by western blot 
analysis using affinity-purified RY213. 

Immunofluorescence and western blot analysis. Cells were washed and fixed 
with 4% paraformaldehyde in PBS. The following primary antibodies were used: 
anti-GM130 (monoclonal, BD Bioscience), anti-KDEL (monoclonal, Stressgen), 
anti-FLAG (rabbit polyclonal, MBL), anti-α-DG (monoclonal, IIH6C4 and 
VIA4-1, Millipore) and anti-laminin (rabbit polyclonal, Sigma). To stain nuclei, 
4’,6-diamidino-2-phenylindole (DAPI, Sigma) was added to the secondary 
antibody solution at a final concentration of 1 ng ml⁻¹. Cells were observed under 
fluorescence confocal microscopy (Carl Zeiss). Western blot analysis and laminin 
overlay assays were performed as described previously’’. 

Mutagenesis analysis. We made the four fukutin constructs: pHn, human normal 
fukutin construct consisting of exon 2-9 cDNA and genomic normal exon 10; 
pHp, patient fukutin construct consisting of exon 2-9 cDNA and genomic patient 
exon 10 with SVA insertion; pSpl, patient fukutin construct pHp, which lacks the 
abnormally spliced region; pAcc, patient fukutin construct pHp with AG to GG 
replacement at the acceptor site within the SVA sequence. These constructs were 
transfected into HeLa S3 cells using Effectene (Qiagen). After extraction of poly-A 
RNA by Oligotex, northern blot analysis was performed using 2 µg of poly-A RNA 
for each sample with the stronger denaturation described above. 

AON treatment of FCMD model mice. For intramuscular injection, we injected 
cardiotoxin (10 µM) (Latoxan) percutaneously into tibialis anterior (0.3 nmol) 
and gastrocnemius (0.7 nmol) of Hp/+, Hp/—, Hp/Hp and Hn/Hn mice on 
day 0 (n = 3 for each genotype). On days 1, 4 and 7, VMO (400 mg kg⁻¹) solubilized 
in sterile distilled water was injected. AED and Dys were administered to 
the left and the right legs, respectively. For systemic injection, an intraperitoneal 
injection of butorphanol tartrate (5 mg kg⁻¹) (Bristol-Myers Squibb) was performed 
on day 0. VMO (20 mg kg⁻¹) solubilized in 5% glucose solution was 
administered by intravenous injection through the tail vein on days 1 and 7 
(n = 4 for Hp/—, n = 2 for Hp/+). Mice were killed on day 21, and total RNA 
or protein lysate was isolated from each tissue for further analyses of fukutin 
mRNA expression, fukutin protein translation, and glycosylation of α-DG. 

AON treatment of human patient cell lines. For protein analysis, VMO cocktails 
(AED and Dys) were introduced into FCMD and normal control lymphoblasts at a 
final concentration of 2.5 µM in culture medium using a Gene Pulser II 
Electroporator (0.25-kV voltage, 960-µF capacitance, with 0.4-cm gene pulser 
cuvettes, giving a time-constant readout of approximately 40 ms) (Bio-Rad) 
(n = 2). For glycosylation analysis, VMO cocktails (AED and Dys) were introduced 
into myoblasts from patients with FCMD and normal control cells by direct 
addition to the culture medium at a final concentration of 4 µM (n = 2). After 
incubation for 48 h, cells were collected and total RNA or protein lysate was isolated. 

Laminin clustering assay. The AED cocktail was introduced into myotubes by 
direct addition to the culture medium at a total concentration of 4 µM after a 
medium change on day 2. On days 10-14, mouse EHS laminin-1 (Sigma) was 
added with fresh medium at a concentration of 1.0 nM and incubated for 30 min, 
followed by immunocytochemistry. 

SVA sequence analysis. The SVA sequence was aligned to the SVA reference sequence 
present in Repbase (http://www.girinst.org/repbase/update/index.html)**, and the 
locations on the SVA reference of the splicing acceptor and donor sites in the SVA were 
determined. 

ATP-induced helicase slippage reveals highly coordinated subunits 


Bo Sun, Daniel S. Johnson, Gayatri Patel, Benjamin Y. Smith, Manjula Pandey, Smita S. Patel & Michelle D. Wang 


Helicases are vital enzymes that carry out strand separation of 
duplex nucleic acids during replication, repair and recombina- 
tion'’. Bacteriophage T7 gene product 4 is a model hexameric 
helicase that has been observed to use dTTP, but not ATP, to 
unwind double-stranded (ds)DNA as it translocates from 5’ to 3’ 
along single-stranded (ss)DNA”°. Whether and how different sub- 
units of the helicase coordinate their chemo-mechanical activities 
and DNA binding during translocation is still under debate’. 
Here we address this question using a single-molecule approach 
to monitor helicase unwinding. We found that T7 helicase does in 
fact unwind dsDNA in the presence of ATP and that the unwinding 
rate is even faster than that with dTTP. However, unwinding traces 
showed a remarkable sawtooth pattern where processive unwind- 
ing was repeatedly interrupted by sudden slippage events, ulti- 
mately preventing unwinding over a substantial distance. This 
behaviour was not observed with dTTP alone and was greatly 
reduced when ATP solution was supplemented with a small 
amount of dTTP. These findings presented an opportunity to 
use nucleotide mixtures to investigate helicase subunit coordina- 
tion. We found that T7 helicase binds and hydrolyses ATP and 
dTTP by competitive kinetics such that the unwinding rate is dictated 
simply by their respective maximum rates Vmax, Michaelis 
constants KM and concentrations. In contrast, processivity does 
not follow a simple competitive behaviour and shows a cooperative 
dependence on nucleotide concentrations. This does not agree 
with an uncoordinated mechanism where each subunit functions 
independently, but supports a model where nearly all subunits 
coordinate their chemo-mechanical activities and DNA binding. 
Our data indicate that only one subunit at a time can accept a 
nucleotide while other subunits are nucleotide-ligated and thus 
they interact with the DNA to ensure processivity. Such subunit 
coordination may be general to many ring-shaped helicases and 
reveals a potential mechanism for regulation of DNA unwinding 
during replication. 
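
The statement that the unwinding rate follows competitive kinetics can be made concrete with the textbook rate law for two substrates competing for the same site. The sketch below uses that standard form, which is an assumption here because the fitted equation is not given in this part of the text. The parameter values are hypothetical placeholders, not measured constants.

```python
def competitive_rate(conc_a, conc_b, vmax_a, km_a, vmax_b, km_b):
    """Rate for two substrates competing for one binding site.

    Standard competitive form:
        v = (Vmax_a*[A]/KM_a + Vmax_b*[B]/KM_b) / (1 + [A]/KM_a + [B]/KM_b)
    With only one substrate present it reduces to Michaelis-Menten kinetics.
    """
    a = conc_a / km_a
    b = conc_b / km_b
    return (vmax_a * a + vmax_b * b) / (1.0 + a + b)


# Hypothetical parameters (rate in bp per second, concentrations in mM).
VMAX_ATP, KM_ATP = 300.0, 1.0
VMAX_DTTP, KM_DTTP = 100.0, 0.5

atp_only = competitive_rate(2.0, 0.0, VMAX_ATP, KM_ATP, VMAX_DTTP, KM_DTTP)
dttp_only = competitive_rate(0.0, 2.0, VMAX_ATP, KM_ATP, VMAX_DTTP, KM_DTTP)
mixture = competitive_rate(2.0, 0.2, VMAX_ATP, KM_ATP, VMAX_DTTP, KM_DTTP)
print(atp_only, dttp_only, mixture)
```

In this form the rate of a mixture always lies between the two single-nucleotide rates, weighted by how much of each nucleotide is bound. A quantity that departs from such weighting, as reported for processivity, therefore calls for a different model.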

Despite the fact that most motor proteins use ATP as a fuel source, 
previous bulk studies have shown that T7 helicase does not unwind 
DNA efficiently in the presence of ATP, although it is capable of ATP 
hydrolysis***. To investigate why ATP seemed not to support T7 
helicase unwinding, we used a single-molecule optical trapping assay 
that we previously developed to measure unwinding of dsDNA or 
translocation on ssDNA (Fig. 1a and Supplementary Fig. 1)’. Briefly, 
two strands of a DNA fork junction were held under tension that was 
not sufficient to mechanically unwind the junction without a helicase. 
Helicase unwinding of the junction resulted in an increase in the 
ssDNA length, permitting tracking of the helicase location. When 
experiments were conducted with 2 mM ATP, we were surprised to 
find that ATP not only supported dsDNA unwinding but did so 
at a significantly faster rate than dTTP (Fig. 1b-c). 
However, processive unwinding was interrupted by slippage events, 
resulting in a remarkable sawtooth pattern in the unwinding trace 


(Fig. 1b). Control experiments verified that each trace was the action 
of a single helicase (Supplementary Fig. 2). We attribute this pattern to 
helicase losing its grip on the ssDNA, sliding backwards under the 
influence of the reannealing DNA fork, and then regaining its grip 
and resuming unwinding (Fig. 1d). In contrast, slippage behaviour was 
essentially absent with 2 mM dTTP alone (Fig. 1b). These results 
resolve the mystery of the apparent lack of significant unwinding 
activity seen in bulk studies*®*; unwinding and slippage could not 
be separated, so unwinding was masked by unobservable slips that 
prevented helicase from moving over a substantial distance. Our work 
is the first direct observation, to our knowledge, of helicase nucleotide- 
specific slippage. Previous studies of non-ring-shaped helicases have 
reported reverse motions of the unwinding fork attributable to helicase 
reaching the end of the DNA or encountering a barrier’”’, dissociating 
from the DNA”, or moving in the reverse direction”’*"’. These are of 
a somewhat different nature than what we have observed. The only 
slippage behaviour that may resemble ours is from non-helicase bac- 
teriophage motors’*"’, but their slippage is not a result of the use of a 
specific nucleotide. 

Slippage was not observed with dTTP alone (Fig. 1b) and therefore 
seems to be sensitive either to the base composition of the bound nuc- 
leotide (for example, adenosine versus thymidine) or the type of sugar 
(ribose versus deoxyribose). We compared slippage for all four NTPs and 
their dNTP counterparts (Supplementary Fig. 3). For each nucleotide we 
measured processivity, defined as the mean distance between slips 
(Supplementary Fig. 4). The results indicate that the additional 
2'-OH group on the ribose sugar makes the helicase more prone to 
slipping. Examination of the helicase structure at the nucleotide-binding 
pocket'® reveals that the 2’-OH group of a bound nucleotide may dis- 
place the -OH group on the side chain of residue Y535 (Supplementary 
Fig. 5a). We thus generated a Y535F mutant to remove the -OH group 
and it showed significantly increased processivity in the presence of ATP, 
albeit still less than that seen for dATP (Supplementary Fig. 5b). 
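
Processivity, defined above as the mean distance between slips, can be estimated from an unwinding trace along the following lines. This is a minimal sketch under stated assumptions: a slip is taken to be any drop between consecutive samples of at least `min_drop_bp` base pairs, a hypothetical threshold, and real slips that span several samples would need a more careful detector. The trace values are invented for illustration.

```python
def find_slips(trace_bp, min_drop_bp=20):
    """Return indices where the trace drops by at least min_drop_bp."""
    return [
        i for i in range(1, len(trace_bp))
        if trace_bp[i - 1] - trace_bp[i] >= min_drop_bp
    ]


def mean_distance_between_slips(trace_bp, min_drop_bp=20):
    """Mean number of base pairs unwound in each run that ends with a slip.

    Returns None when no slip is detected, since the distance is then
    limited by the length of the trace rather than by slippage.
    """
    slips = find_slips(trace_bp, min_drop_bp)
    if not slips:
        return None
    distances = []
    run_start = trace_bp[0]
    for i in slips:
        distances.append(trace_bp[i - 1] - run_start)  # bp gained before slip
        run_start = trace_bp[i]                        # position after slip
    return sum(distances) / len(distances)


# Hypothetical sawtooth trace: number of unwound base pairs at each sample.
trace = [0, 50, 100, 150, 30, 80, 130, 10, 60]
print(find_slips(trace))
print(mean_distance_between_slips(trace))
```

The final run in the example trace does not end in a slip and is therefore excluded from the mean, a design choice that avoids counting truncated runs.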

Although ATP caused helicase to slip more frequently, it supported a 
much faster unwinding rate between slips, consistent with an earlier 
finding of a faster rate of ATP hydrolysis. Because ATP and dTTP
support different unwinding rates and processivities, we used nucleotide
mixtures to understand how multiple subunits of the helicase
coordinate unwinding activity. We approximated the in vivo concentrations
of ATP and dTTP of Escherichia coli by using 2.0 mM ATP
and a small amount of dTTP, 0.2 mM (Fig. 1b, c). Although the unwinding
rate between slips was close to the value observed with 2 mM ATP
alone, the processivity increased by approximately threefold. When the 
converse experiment was performed (0.2 mM ATP and 2.0 mM dTTP), 
the unwinding rate was comparable to that with 2 mM dTTP alone and 
minimal slippage was observed (Fig. 1b, c). These results imply that 
even a small fraction of helicase subunits, when bound with dTTP, 
reduce slippage and substantially increase processivity. This finding 
was further substantiated by bulk experiments using ATP alone, and 
an ATP/dTTP mixture (Supplementary Fig. 6).

Figure 1 | Comparison of helicase unwinding behaviours with different
nucleotides. a, Schematic of the single-molecule configuration (not to scale).
The single-stranded ends of a dsDNA were held at a constant unzipping force of
8 pN while T7 helicase unwound the dsDNA by translocating on ssDNA.
b, Representative traces showing the number of unwound base pairs versus
time in the presence of various concentrations of nucleotides. For clarity, traces
have been arbitrarily shifted along both axes. c, A summary of unwinding rates
and processivities. Uncertainties are s.e.m. d, Cartoon illustrating slippage
behaviour. The helicase unwinds, loses grip, slips, re-grips and resumes
unwinding. Dotted helicase indicates a previous location of the helicase.

To determine if T7 helicase binds DNA with different affinities in the presence of dTTP and
ATP, bulk binding studies were carried out using fluorescence aniso- 
tropy with dTTP and ATP analogues (Supplementary Fig. 7). The 
results show that T7 helicase binds ssDNA 100-fold more tightly with 
dTMPPCP than with AMPPCP, and indicate that the greater slippage 
in the presence of ATP is probably due to weaker binding to DNA. 
The discovery of helicase slippage and the ability to directly measure 
helicase processivity provided a unique opportunity to explore the fol- 
lowing: (1) how ATP and dTTP compete for binding to helicase sub- 
units; (2) how nucleotide binding regulates helicase affinity to DNA; 
and (3) how multiple subunits of helicase coordinate their activities. 

Figure 2 | Helicase unwinding kinetics. a, Example of unwinding with ATP to
illustrate the method of determining unwinding rate by analysing data between
slips. b, Kinetic constants for unwinding under a constant unzipping tension of
8 pN in the presence of either ATP (right) or dTTP (left). For each nucleotide, KM
and Vmax were obtained by fitting the unwinding rates as a function of NTP
concentration to the Michaelis-Menten equation. c, Measured unwinding rates at
either fixed [dTTP] and varying [ATP], or fixed [ATP] and varying [dTTP], and
comparison with direct predictions (not fits) from the competitive nucleotide
binding model using kinetic constants KM and Vmax shown in b. Error bars
indicate s.e.m. d, Kinetic pathway of a competitive binding model where ATP and
dTTP compete for binding and hydrolysis by the helicase (denoted by E here).

To understand how ATP and dTTP compete for binding to helicase
subunits, we determined the unwinding rates between slippage events
(Fig. 2a) as a function of nucleotide concentration. For each nucleotide
alone, the unwinding rate followed Michaelis-Menten-like kinetics,
yielding Vmax and KM values that were both higher for ATP than for
dTTP (Fig. 2b). These kinetics indicated that there was no cooperativity
in NTP binding and hydrolysis. Next, we conducted experiments in 
which the concentration of one nucleotide was fixed while that of the 
other nucleotide was varied. The resulting unwinding rates could be 
explained by competitive kinetics: ATP and dTTP compete for binding 
based on their respective affinities and the resulting reaction rate is 
determined by their concentrations, Vmax and KM (Fig. 2c, d; Methods
Summary and Supplementary Discussion). A comparison of unwind- 
ing rates with mixed nucleotides and direct predictions (not fits) from 
the competitive binding kinetics showed excellent agreement. These 
results were further substantiated by ssDNA translocation rate experi- 
ments (Supplementary Fig. 8). This also explains why in Fig. 1b, c the 
unwinding rate was minimally altered when 0.2 mM of dTTP was
added to 2 mM ATP. Under those conditions, only about 16% of the
nucleotide bound to the helicase hexamer was dTTP. 
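
The competitive-binding prediction described above can be sketched numerically. The snippet below is a minimal illustration, assuming the standard competitive Michaelis-Menten form referred to in the text. The function names are ours, and the kinetic constants are hypothetical placeholders (the measured values are reported in Fig. 2b and are not reproduced here); they were picked only so that the dTTP share of bound nucleotide comes out near the 16% quoted above.

```python
def bound_fractions(atp_mM, dttp_mM, km_atp_mM, km_dttp_mM):
    """Fractions of nucleotide sites that are empty, ATP-bound and dTTP-bound."""
    a = atp_mM / km_atp_mM      # ATP occupancy weight, [ATP]/KM(ATP)
    d = dttp_mM / km_dttp_mM    # dTTP occupancy weight, [dTTP]/KM(dTTP)
    denom = 1.0 + a + d
    return 1.0 / denom, a / denom, d / denom


def unwinding_rate(atp_mM, dttp_mM, vmax_atp, km_atp_mM, vmax_dttp, km_dttp_mM):
    """Competitive-binding rate: each bound state contributes its own Vmax."""
    _, f_atp, f_dttp = bound_fractions(atp_mM, dttp_mM, km_atp_mM, km_dttp_mM)
    return vmax_atp * f_atp + vmax_dttp * f_dttp


# Hypothetical constants for illustration only (NOT the values of Fig. 2b).
PLACEHOLDER = dict(vmax_atp=250.0, km_atp_mM=0.95, vmax_dttp=100.0, km_dttp_mM=0.5)

if __name__ == "__main__":
    _, f_atp, f_dttp = bound_fractions(2.0, 0.2, 0.95, 0.5)
    print("dTTP share of bound nucleotide:", f_dttp / (f_atp + f_dttp))
    print("rate, 2 mM ATP alone:", unwinding_rate(2.0, 0.0, **PLACEHOLDER))
    print("rate, 2 mM ATP + 0.2 mM dTTP:", unwinding_rate(2.0, 0.2, **PLACEHOLDER))
```

With a single nucleotide the function reduces to ordinary Michaelis-Menten kinetics, which is why the same constants can be used for the mixed-nucleotide predictions without any further fitting.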

The competitive binding kinetics for nucleotides, however, does not 
explain the observed slippage behaviour with mixed nucleotides 
(Fig. 1b, c). That is, it is unclear how the 16% bound dTTP resulted
in a threefold increase in processivity. If only a single nucleotide can be
bound by the helicase at a time and the type of the bound nucleotide
determines the helicase’s affinity to the DNA, then processivity should
only increase by 7% (Supplementary Discussion). In addition, it has
previously been shown that the helicase subunits do not bind to
ssDNA in the absence of a nucleotide. However, we found minimal
slippage even at [dTTP] much below its KM. These observations indicate
participation of multiple subunits in both nucleotide and DNA
binding, where each subunit would have a nucleotide-specific DNA 
binding affinity. Our data indicate that helicase may not slip if at least 
one subunit of the hexamer is in a deoxythymidine-ligated state, which 
has a higher affinity for the DNA. 

Two models may be consistent with this idea. In an uncoordinated
model, each helicase subunit functions independently in its nucleotide
binding/hydrolysis, and DNA binding/release (Supplementary
Discussion). Conversely, coordinated models have been proposed
for T7 helicase, but details of the coordination remain unclear.
Biochemical and structural studies indicate that nucleotide hydrolysis
may occur sequentially around the hexameric ring, that roughly
four subunits are nucleotide-ligated at any given time, and that DNA
binding to the helicase might involve one-to-two helicase subunits.
A model based on structural studies has been proposed for ring-shaped
helicases E1 (ref. 23) and Rho, where all or some of the subunits
coordinate their chemo-mechanical activities (Fig. 3d). Coordination
could occur sequentially around the hexameric ring with the leading
subunit poised for NTP binding and each successive subunit having a
bound nucleotide in states of progression along the chemical reaction
pathway (NTP, NDP + Pi, NDP, and so on). Depending on the state
and type of nucleotide bound, each subunit may have a different affinity
to DNA. Once the leading subunit binds to an NTP and reels in the
DNA, the remaining subunits progress to their next reaction states.
Product release by the last participating subunit results in release of
DNA from that subunit, and thus completes a single cycle.

Figure 3 | Processivity dependence on nucleotides and a proposed
coordinated model. a, An example of unwinding with ATP to illustrate the
method of determining distance between slips. b, c, Measured processivity
(mean distance between slipping events) as a function of [ATP] alone, and as
functions of [dTTP] at two fixed concentrations of ATP. Note processivity
increased substantially when a small amount of dTTP was added to the
reaction. Solid lines are global fits using the coordinated model, yielding
n = 5.2 ± 0.4. For comparison, fits using n = 2 are also shown. Error bars
indicate s.e.m. d, An interpretation of the proposed coordinated model. Each
subunit is uniquely labelled with a different colour and has a potential
ssDNA-binding site (small red dot). Nucleotide binding and subsequent
hydrolysis occur sequentially around the ring. If a subunit is nucleotide-ligated
(the state of hydrolysis indicated by Ni), it has a non-zero probability of being
bound to ssDNA. During unwinding, the leading subunit can bind to a
nucleotide (N) and thus acquire affinity to the upstream ssDNA. This
stimulates the last nucleotide-bound subunit to release its nucleotide and
ssDNA. Then the cycle proceeds again around the ring. Slippage occurs when
all subunits simultaneously release ssDNA, as determined by the joint
probability of detachment for all subunits (Supplementary Discussion).

We formulated quantitative descriptions for the uncoordinated and 
coordinated models (Supplementary Discussion). The observed rate of 
unwinding as a function of [ATP] or [dTTP] is consistent with both 
models, which predict an apparent Michaelis-Menten-like kinetics. 
The observed unwinding rate with ATP and dTTP mixtures is also 
consistent with the competitive binding kinetics for both models as 
long as, in the case of the coordinated model, the rates are treated as 
averages over time (Supplementary Discussion). Although the two 
models cannot be distinguished based on rate measurement studies, 
they do yield different predictions for DNA slippage behaviour. The 
uncoordinated model (Supplementary Discussion) requires that each 
subunit binds and hydrolyses nucleotides independently with an affinity 
to DNA dependent on the state and type of nucleotide bound. This 
model is not consistent with the processivity data taken with mixed 
nucleotides at concentrations near or lower than their respective KM
values (Supplementary Fig. 9). 

On the other hand, the coordinated model requires that subunits 
participating in coordination bind and hydrolyse nucleotide in coordi- 
nation, with only one subunit poised to bind a nucleotide at a time and 
with each subunit having an affinity to DNA dependent on the state and 
type of nucleotide bound. This model predicts that processivity should 
increase linearly with [NTP] in the presence of a single type of NTP. 
Indeed, our data show that the processivity increases linearly with 
increasing [ATP] (Fig. 3a, b). If multiple helicase subunits coordinate 
in their chemo-mechanical activities, what is the degree of coordination 
as measured by the number of participating subunits at any given time 
(n)? This is a key parameter that characterizes the mechanism of the 
helicase. Previous studies indicate that only one or two subunits are 
involved in significant DNA binding, suggesting a lower degree of 
coordination of n = 1 or 2 (refs. 16, 20-22). However, subunits may 
participate in the coordination even if they have lower affinity to ssDNA. 
The coordinated model formulated (Supplementary Discussion) is 
rather general and naturally takes this into account. Interestingly, it 
predicts that processivity sensitively depends on n as [dTTP] is 
increased in the presence of a fixed [ATP]—the larger n, the more 
subunits participate in DNA binding, and the more steeply processivity 
increases with [dTTP]. Therefore we measured processivity with mix- 
tures of ATP and dTTP (Fig. 3c). A global fit to the processivity data in 
Fig. 3b, c yielded n = 5.2 ± 0.4 (Methods Summary). In contrast, n = 2
does not agree with the measurements. These findings are further substantiated
by experiments using UTP instead of ATP (Supplementary
Fig. 10, n = 5.0 ± 0.3), experiments under a different unzipping force
(Supplementary Fig. 11, n = 5.4 ± 0.3), and data on time between slips
(Supplementary Fig. 12, n = 5.5 ± 0.4). Because n = 6 is expected for a
hexamer, this finding indicates that nearly all subunits participate in the 
coordination (n = 5 or 6) (Fig. 3d). Our findings suggest that only one 
subunit at a time can accept an incoming nucleotide, while the rest of the 
subunits are already nucleotide bound and coordinate to prevent slip- 
page and maintain high processivity. 
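
To see why the fitted n is so sensitive to a small admixture of dTTP, here is a minimal sketch. It assumes, as our reading of the coordinated-model expression in the Methods, that the distance between slips scales as c (Vmax_ATP·a + Vmax_dTTP·d) · (a / (a + d))^(-n), where a = [ATP]/KM(ATP) and d = [dTTP]/KM(dTTP). The constants are hypothetical placeholders, not the measured values, and the function names are ours.

```python
def processivity(atp_mM, dttp_mM, n, c, vmax_atp, km_atp_mM, vmax_dttp, km_dttp_mM):
    """Distance between slips under the assumed coordinated-model expression."""
    a = atp_mM / km_atp_mM
    d = dttp_mM / km_dttp_mM
    if a <= 0:
        raise ValueError("this sketch needs some ATP present")
    drive = vmax_atp * a + vmax_dttp * d   # grows linearly with [NTP]
    atp_share = a / (a + d)                # ATP share of bound nucleotide
    return c * drive * atp_share ** (-n)   # slipping needs all n subunits on ATP


def fold_increase(n, **constants):
    """Processivity gain from adding 0.2 mM dTTP to 2 mM ATP."""
    base = processivity(2.0, 0.0, n, **constants)
    mixed = processivity(2.0, 0.2, n, **constants)
    return mixed / base


# Hypothetical constants for illustration only.
PLACEHOLDER = dict(c=1.0, vmax_atp=250.0, km_atp_mM=0.95,
                   vmax_dttp=100.0, km_dttp_mM=0.5)

if __name__ == "__main__":
    for n in (2, 5.2):
        print(f"n = {n}: fold increase = {fold_increase(n, **PLACEHOLDER):.2f}")
```

With these placeholder constants the gain is about 1.5-fold for n = 2 and about 2.7-fold for n = 5.2, which illustrates why a roughly threefold gain from a small amount of dTTP points to nearly all subunits taking part.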

The work presented here provides a quantitative description of 
nucleotide binding/hydrolysis and its coupling to DNA binding and 
translocation for T7 helicase. This was possible because unwinding 
and slippage events are clearly distinguishable in single-molecule 
traces. The slippage behaviour is explained by a multiple-site coordi- 
nated model. For helicase to slip, all six subunits must simultaneously 
lose their grip on the DNA. This happens more often when helicase 
subunits are bound only to ribose nucleotides. Our data demonstrate
that T7 helicase has a very weak DNA binding affinity in the presence
of ATP but the addition of a small amount of dTTP to the ATP
reaction increases the binding affinity of helicase to DNA. As a consequence,
the presence of a single deoxythymidine-ligated subunit
significantly decreases the chance of slippage so that helicase can still
effectively unwind dsDNA with ATP. Thus T7 helicase, like most other
helicases, could still use ATP as a main power source in vivo, under
conditions such as those during phage infection of E. coli where ATP
is most abundant. ATP could be used for rapid unwinding and dTTP 
for high processivity. Although we focus here on a comparison of 
dTTP with ATP for helicase unwinding, other deoxyribose nucleotides 
may also reduce the frequency of slippage (Supplementary Fig. 3). We 
speculate that slippage may also provide an evolutionary advantage for 
replication: when dNTP concentrations are low, slippage can slow 
down helicase to allow its synchronization with a slow-moving DNA 
polymerase. 


METHODS SUMMARY 


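Single-molecule assays were performed as described previously. If dTTP and ATP
compete for binding to helicase according to the kinetic pathway outlined in Fig. 2d,
then the resulting unwinding rate is:

$$V_{\mathrm{tot}} = \frac{V_{\max}^{\mathrm{ATP}}\,\dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + V_{\max}^{\mathrm{dTTP}}\,\dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}}{1 + \dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + \dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}},$$

where for each type of nucleotide $K_M$ is the Michaelis constant set by the
binding, unbinding and catalytic rate constants of Fig. 2d, and
$V_{\max} = s\,k_{\mathrm{cat}}$ with $s$ being the step size (in nucleotides) (see Supplementary
Discussion). In the presence of dTTP and ATP, if n helicase subunits coordinate
in their chemo-mechanical activities and DNA binding, then the resulting distance
between slips (processivity) is:

$$d_{\mathrm{processivity}} = c\left(V_{\max}^{\mathrm{ATP}}\,\frac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + V_{\max}^{\mathrm{dTTP}}\,\frac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}\right)\left(\frac{[\mathrm{ATP}]/K_M^{\mathrm{ATP}}}{[\mathrm{ATP}]/K_M^{\mathrm{ATP}} + [\mathrm{dTTP}]/K_M^{\mathrm{dTTP}}}\right)^{-n}$$

(Supplementary Discussion), with c being a proportionality constant. This
expression was used to fit data in Fig. 3b, c with c and n as fit parameters.
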


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS

Protein and DNA preparations. Wild-type T7 helicase (gp4A') and Y535F 4A'
were expressed and purified as described previously. A 5.2 kb DNA was constructed
as described elsewhere, with minor modifications. Briefly, a ~1.1 kb
anchoring segment was prepared by PCR from pRL574 using a digoxigenin-labelled
primer, and then digested with BstXI (NEB) to produce a 3 bp overhang.
A ~4.1 kb unzipping/translocation/unwinding segment was derived from
pCP681 by digestion with EarI (NEB) and ligated to a biotin-labelled 37 bp segment
lacking a 5' phosphate on the distal end. The anchoring segment and
unzipping segment were then ligated, with a nick due to the missing phosphate.
For ssDNA translocation experiments (Supplementary Fig. 8), the ~4.1 kb segment
was capped with a hairpin (5'-TAGGGCGACCTAGCTCTATGCTAGGTCGCC-3').

Single-molecule assays. Sample preparation was similar to that previously 
described. Briefly, helicase was prepared by first incubating 2 µM of the helicase
monomer for 20 min in the unwinding buffer. This solution was then further
diluted to obtain the final experimental concentration of helicase monomer, nucleotides
and MgCl2. DNA tethers were formed by first non-specifically coating the
sample chamber surface with anti-digoxigenin (Roche), followed by an incubation
with digoxigenin-tagged DNA. Streptavidin-coated 0.48 µm polystyrene microspheres
were then added to the chamber. Finally, helicase solution was flowed in
just before data acquisition. The helicase unwinding buffer was 20 mM Tris-HCl
(pH 7.5), 3 mM EDTA, 0.02% Tween 20, 50 mM NaCl, NTPs or dNTPs at the
concentrations specified in the text, and MgCl2 at a concentration 5 mM in excess of
the total nucleotide concentration (Supplementary Fig. 13). The helicase monomer
concentration was adjusted between 1 and 500 nM for each buffer condition so that the
average unwinding initiation time (defined as the time between when the DNA was 
initially mechanically unzipped and when the helicase began to unwind) was 
approximately the same for all experiments (Supplementary Fig. 3). 

Experiments were conducted in a climate-controlled room at a temperature of 
23.3 °C, but owing to local laser trap heating the temperature increased slightly to
25 ± 1 °C (ref. 26). Each experiment was conducted in the following steps
(Supplementary Fig. 1). First, several hundred base pairs of dsDNA were mechanically
unzipped, at a constant velocity of 1,400 bp s⁻¹, to produce a ssDNA
loading region for helicase. Second, after the force dropped owing to helicase
loading and initiation of unwinding, several hundred more base pairs were mechanically
unzipped to generate ssDNA for helicase translocation. Third, the fork
position was maintained until the force dropped again, indicating that the helicase
had again reached the junction, at which point the force was allowed to drop to
8 pN and then maintained at this level as helicase unwound the remaining ~3 kb
of dsDNA. Measurements of ssDNA translocation rates and dsDNA unwinding 
rates by T7 helicase were thus obtained for each tether. 

Data collection and analysis. Data were low-pass filtered to 5 kHz and digitized at 
12 kHz, then were further averaged to 110 Hz. The acquired data signals were
converted into unwound base pairs as previously described. To improve positional
accuracy and precision, the data were then aligned to a theoretical unzipping
curve for the mechanically unzipped section of the DNA. Slippage events were
identified by a threshold on the instantaneous unwinding rate at each sequence
position (Supplementary Fig. 4). We used a threshold of 2,000 bp s⁻¹ in the reverse
velocity for identifying slippage. Unwinding rates from each trace were found from
linear fits to the unwinding between adjacent slippage events. An average unwinding
rate was obtained from a number of traces. Distances travelled between slips
were compiled to determine processivity. These distances followed an exponential
distribution, indicating a stochastic process in slippage. Processivity is defined as
the mean distance of the distribution (Supplementary Fig. 4b). 
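
The analysis steps above (threshold on reverse velocity, linear fits between slips, mean distance between slips) can be sketched as follows. This is an illustrative outline only, not the authors' analysis code: the function names, the 0.01 s sampling interval and the synthetic trace are assumptions made for the example, and only the 2,000 bp/s reverse-velocity threshold comes from the text.

```python
from statistics import mean


def slip_events(times_s, unwound_bp, reverse_threshold_bp_per_s=2000.0):
    """Return (start, end) sample indices for each run of fast reverse motion."""
    events, start = [], None
    for i in range(len(times_s) - 1):
        velocity = (unwound_bp[i + 1] - unwound_bp[i]) / (times_s[i + 1] - times_s[i])
        if velocity < -reverse_threshold_bp_per_s:
            if start is None:
                start = i              # last sample before the fork runs backwards
        elif start is not None:
            events.append((start, i))  # sample i is where unwinding resumes
            start = None
    if start is not None:              # trace ended in the middle of a slip
        events.append((start, len(times_s) - 1))
    return events


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def analyse(times_s, unwound_bp, reverse_threshold_bp_per_s=2000.0):
    """Return (processivity in bp, mean unwinding rate in bp/s) for one trace."""
    events = slip_events(times_s, unwound_bp, reverse_threshold_bp_per_s)
    if len(events) < 2:
        raise ValueError("need at least two slips to measure a distance between slips")
    distances, rates = [], []
    for (_, regrip), (next_slip, _) in zip(events, events[1:]):
        distances.append(unwound_bp[next_slip] - unwound_bp[regrip])
        rates.append(fit_slope(times_s[regrip:next_slip + 1],
                               unwound_bp[regrip:next_slip + 1]))
    return mean(distances), mean(rates)


def synthetic_trace():
    """Hypothetical trace: steady unwinding at 100 bp/s interrupted by three slips."""
    dt = 0.01  # assumed sampling interval in seconds
    pos = [0.0]
    for rise_samples, drop_bp in [(50, 40.0), (30, 25.0), (20, 30.0), (10, 0.0)]:
        for _ in range(rise_samples):
            pos.append(pos[-1] + 1.0)
        if drop_bp:
            pos.append(pos[-1] - drop_bp)
    return [i * dt for i in range(len(pos))], pos


if __name__ == "__main__":
    times, pos = synthetic_trace()
    print(analyse(times, pos))
```

Only distances between adjacent slips are counted here; the stretches before the first slip and after the last one are left out because their true lengths are not known from the trace.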

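Modeling. If dTTP and ATP compete for binding to helicase according to the
kinetic pathway outlined in Fig. 2d, then the resulting unwinding rate is:

$$V_{\mathrm{tot}} = \frac{V_{\max}^{\mathrm{ATP}}\,\dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + V_{\max}^{\mathrm{dTTP}}\,\dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}}{1 + \dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + \dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}},$$

where for each type of nucleotide $K_M$ is the Michaelis constant set by the
binding, unbinding and catalytic rate constants of Fig. 2d, and
$V_{\max} = s\,k_{\mathrm{cat}}$ with $s$ being the step size (in
nucleotides) (see Supplementary Discussion). In the presence of dTTP and
ATP, if n helicase subunits coordinate in their chemo-mechanical activities and
DNA binding, then the resulting distance between slips (processivity) is:

$$d_{\mathrm{processivity}} = c\left(V_{\max}^{\mathrm{ATP}}\,\frac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + V_{\max}^{\mathrm{dTTP}}\,\frac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}\right)\left(\frac{[\mathrm{ATP}]/K_M^{\mathrm{ATP}}}{[\mathrm{ATP}]/K_M^{\mathrm{ATP}} + [\mathrm{dTTP}]/K_M^{\mathrm{dTTP}}}\right)^{-n}$$

(Supplementary Discussion), with c being a proportionality constant. This
expression was used to fit data in Fig. 3b, c with c and n as fit parameters.

CORRIGENDUM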
doi:10.1038/nature10459 


Detection of prokaryotic mRNA 
signifies microbial viability and 
promotes immunity 


Leif E. Sander, Michael J. Davis, Mark V. Boekschoten, 
Derk Amsen, Christopher C. Dascher, Bernard Ryffel, 
Joel A. Swanson, Michael Miller & J. Magarian Blander 


Nature 474, 385-389 (2011). 


In Fig. 1d of this Letter, the labels HKEC and EC were swapped in the 
print version. The lane labelled HKEC should be labelled EC and the 
lane labelled EC should be labelled HKEC. The error has been 
corrected online in the HTML and PDF versions.

BRIGHT LIGHT, BETTER LABELS


The tiniest structures in cells can be seen only using sophisticated instrumentation and 
informatics, but what biologists really need are improved fluorescent probes. 


Mitochondria in a cell, imaged by conventional microscopy (left), and super-resolution microscopy colour-coded by depth (middle) and in cross-section (right). 


BY MONYA BAKER 


Scientists love to decorate their favourite biomolecules with fluorescent tags.
Attaching light-emitting labels to a
protein can reveal when and where in a cell
it functions, but usually the details are fuzzy.
Optical microscopes use light with wavelengths
between 350 and 750 nanometres, and
structures smaller than about 200 nm cannot
be seen clearly. That is much bigger than the
thickness of a cell membrane and is about
half as long as the mitochondria that supply
cells’ energy. At this scale, many cellular
secrets are invisible. The protein machinery
that allows a virus to invade a cell is blurry,
as are the synapses across which neurons
communicate.
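
The figure of roughly 200 nm follows from the diffraction limit of a conventional light microscope. As a rough worked example, using the standard Abbe formula and assuming green light of 500 nm and an objective with a numerical aperture of 1.25 (both illustrative values, not taken from the article):

```latex
d = \frac{\lambda}{2\,\mathrm{NA}} = \frac{500\ \mathrm{nm}}{2 \times 1.25} = 200\ \mathrm{nm}
```

Shorter wavelengths or a higher numerical aperture lower this figure somewhat, but not by the order of magnitude needed to see most protein assemblies.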

The past few years have seen the rise of a 
suite of techniques, collectively known as 
super-resolution microscopy, that can use 
light to reveal structures much smaller than 
the theoretical limit. The trick is to control 
fluorescent labels, or fluorophores, so that not 
all of them signal at once. Light from each indi- 
vidual fluorophore creates a blur, but as long as 
blurs don’t overlap, they can be resolved into 
individual points at their centres. This allows 
the position of the fluorophore to be identified 
precisely, revealing features as small as 20 nm.

“The super-resolution that we have developed
doesn’t rely on changing the wave nature of
light,” says Stefan Hell, director of nanobiophotonics
at the Max Planck Institute for Biophysical
Chemistry in Göttingen, Germany. “It relies
on turning dyes on and off.”
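
The localization idea described above, finding the centre of each isolated blur, can be illustrated with a toy calculation. This is a simplified sketch, not how any particular super-resolution package works: the blur is modelled as a noise-free Gaussian, the centre is estimated with a plain intensity-weighted centroid, and the 100 nm pixel size and all function names are assumptions made for the example. Real data contain photon noise and background, and are usually analysed by fitting rather than by a bare centroid.

```python
import math


def gaussian_spot(size_px, centre_x, centre_y, sigma_px):
    """Noise-free image of one fluorophore: a 2D Gaussian blur sampled on pixels."""
    return [[math.exp(-((x - centre_x) ** 2 + (y - centre_y) ** 2)
                      / (2.0 * sigma_px ** 2))
             for x in range(size_px)]
            for y in range(size_px)]


def centroid(image):
    """Intensity-weighted centre of the image, in pixel units (x, y)."""
    total = sum(sum(row) for row in image)
    cx = sum(x * value for row in image for x, value in enumerate(row)) / total
    cy = sum(y * value for y, row in enumerate(image) for value in row) / total
    return cx, cy


if __name__ == "__main__":
    PIXEL_NM = 100.0  # hypothetical camera pixel size
    # A blur about 235 nm wide (full width at half maximum) centred off-grid.
    spot = gaussian_spot(21, 10.3, 9.6, sigma_px=1.0)
    cx, cy = centroid(spot)
    print(f"estimated centre: {cx * PIXEL_NM:.1f} nm, {cy * PIXEL_NM:.1f} nm")
```

The blur itself spans a couple of pixels, yet its centre is recovered to a small fraction of a pixel, which is the sense in which non-overlapping blurs can be resolved into points.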

Although advances in instrumentation and 
informatics should not be overlooked, many 
researchers believe that it is better-performing 
fluorescent labels that will allow super-resolu- 
tion microscopy to continue to move forward. 
“That’s an area where the field will see the 
biggest advances,” says Jan Liphardt, a bio- 
physicist at the University of California, 
Berkeley. “That’s been limiting all of us.”

Electron microscopes can resolve features
less than a nanometre long — even smaller 
than super-resolution. But electron micros- 
copy requires elaborate preparation of samples: 
usually, cells must be ‘fixed’ with preservatives 
and then embedded in resin or frozen. By con- 
trast, many forms of super-resolution micros- 
copy can be done with live cells. And with fixed 
cells, labels for optical microscopy can iden- 
tify proteins more specifically than can those 
available for electron microscopy. 

Most super-resolution techniques fall into
two categories. In one, sometimes called illumination-based
super-resolution, precise geometric
patterns of light shine repeatedly across
a sample to control which fluorophores are
active. In the other, sometimes called probe-based
super-resolution, conditions are
tuned so that just a few fluorophores
emit light at a time.

Whereas illumination-based super-resolution
microscopy requires specialized
optical equipment, probe-based techniques
do not. Experiments using the
latter technique are relatively easy to set
up (see ‘Starting up in super-resolution’).
However, only a few
dozen of the hundreds of extant fluorescent
proteins and dyes have the requisite properties
for probe-based super-resolution microscopy:
the ability to change from one ‘spectral state’ to
another when exposed to certain wavelengths
of light (see ‘Fluorescent proteins for super-resolution
microscopy’). Some remain dark
until they are activated; others go from one
colour to another.


ACRONYM UPROAR 

Three labs independently developed the first 
probe-based techniques in 2006: fluorescence
photoactivation localization microscopy
(fPALM) was described by Sam Hess, a
physicist at the University of Maine in Orono;
photoactivated localization microscopy
(PALM) was described by Eric Betzig and
Harald Hess, physicists at the Howard Hughes
Medical Institute’s Janelia Farm Research Campus
in Ashburn, Virginia; and stochastic optical
reconstruction microscopy (STORM) was
described by Xiaowei Zhuang, a physicist at
Harvard University in Cambridge, Massachusetts.
Perhaps one of the most confusing
aspects of these and other probe-based tech- 
niques is what to call them. Commonly used 
terms include fPALM/STORM and varia- 
tions such as single-molecule localization 
microscopy (SMLM) and single-molecule 
active-control microscopy (SMACM). But 
the underlying concepts behind all the probe- 
based techniques are the same, says Sam Hess. 
“You somehow control the molecules so you
only have a few visible at a time, you find their
position, you cycle through a whole bunch of
molecules, and you make a map of where the
molecules were. That’s your image.”

These techniques require more control over
fluorophores than most scientists are used to,
says Michael Davidson, director of the optical
microscopy division at the National High
Magnetic Field Laboratory in Tallahassee, Florida.
“A lot of people are jumping into this. I get the
question of what probes to use probably ten
times a week,” he adds.

Live-cell super-resolution images showing how
actin and membrane proteins associate.


Starting up in super-resolution

Interest in super-resolution techniques is
widespread, but relatively few labs have taken
the plunge. Here are some tips.

Optimize conditions first. “Before we go
through collaboration, we ask that people
try these labels out first under a regular
fluorescent microscope,” says Harald Hess,
a physicist at the Howard Hughes Medical
Institute’s Janelia Farm Research Campus in
Ashburn, Virginia. A fluorescent protein that
works seamlessly with one protein of interest
could completely disrupt another. “Make sure
the proteins photoactivate, make sure the cell
health is okay, make sure that the density’s
right,” says Hess. When a super-resolution
experiment doesn’t work as expected,
researchers can be quick to blame the optical
equipment. Often the problem is actually
with the biological label.

Watch out for artefacts that are no longer
invisible. A 50-nanometre perturbation
is invisible in conventional microscopy
experiments. But in super-resolution, that
distance can tell you whether two proteins
cluster together or stay apart. Artefacts that
researchers could once safely ignore
(microscope drift, a label’s slight effects on
localization) must now be considered.

Get to know your fluorophore. Researchers
can’t predict from the literature how a
protein will behave in their hands. Even when
fused to the same protein, a fluorophore’s
photostability (the number of photons it
can give off) can vary from instrument to
instrument, sample to sample and culture
to culture, especially under differing oxygen
levels. “The numbers will be really specific
to your experimental conditions,” says
Robert Campbell, a protein engineer at the
University of Alberta in Edmonton. The order
of relative photostability of fluorophores
should remain the same, says Campbell, “but
I wouldn’t bet my life on it”.

Consider your lasers. Xiaowei Zhuang, a
physicist at Harvard University in Cambridge,
Massachusetts, developed stochastic optical
reconstruction microscopy (STORM) using
a low-powered laser to avoid damaging the
sample, but it took so long to switch the
fluorophores that acquiring an image took
several minutes. With more powerful lasers,
transitions occur in a millisecond. Michael
Davidson, director of optical microscopy at
the National High Magnetic Field Laboratory in
Tallahassee, Florida, recommends at least
100 milliwatts for green wavelengths, and
up to 200 milliwatts for far-red. The Laser
Combiner produced by Agilent of Santa
Clara, California, contains four lasers, putting
several wavelengths under easy control.

Don’t overactivate the probes. Setting up a
probe-based super-resolution experiment
is easy, but calculating localization spots
does not guarantee resolution higher than
that of conventional microscopy. If too many
probes are activated, localization represents
not individual molecules, but an average of
several, explains Zhuang. “Just because you
get a STORM-like type of image doesn’t mean
that it has high quality,” she says. M.B.


Fluorescent proteins are commonly used for
super-resolution microscopy. The genes that
code for them, often taken from jellyfish or
other sea creatures, are fused with the genes
for the proteins being studied, so that when
the proteins are produced, they too are joined.
Thus, when the fluorophore probe lights up, 
it allows researchers to locate the studied 
protein. The most popular protein for probe- 
based super-resolution microscopy is prob- 
ably mEos2 (ref. 4). When first expressed, it 
fluoresces green, but a burst of ultraviolet light 
turns it red. Such ‘photoconvertible’ fluoro- 
phores offer certain advantages over those that 
start out in a dark, non-fluorescent state: they 
allow researchers to image the protein before 
experiments begin, and so more easily pick out 
healthy cells that are producing high levels of 
the labelled protein. What is more, newly pro- 
duced proteins are different colours from those 
that have already been imaged, so researchers 
can follow pools of proteins over time and get a 
sense of their rates of production and destruction
(see ‘How to build a fluorescent protein’).


EVERYTHING IS ILLUMINATED 

Often, though, one fluorophore per experi- 
ment is not enough. “Most of the outstand- 
ing questions that people want nanometric 
accuracy for are in the relationship of two or 
more different proteins relative to each other,”
explains Jennifer Lippincott-Schwartz, a cell 
biologist at the National Institutes of Health 
in Bethesda, Maryland, and part of the team 
that invented PALM. “The only way that you 
can address that is using different markers at 
the same time.” 

Gleb Shtengel, a physicist at Janelia Farm, 
says that getting two labels to work together 
inside a cell is difficult, partly because the 
optimal conditions for each are not always the 
same. Fluorophores always prove trickier to 
work with than the imaging apparatus. “You 
have to add another laser, but that’s the simplest
part,” says Shtengel. Putting the brighter
label on the less-expressed protein can help to 
make sure that enough data can be collected 
on each of the proteins of interest to fix their 
locations definitively; expression levels must 
also be sufficient and reliable for both proteins. 

And then there are the spectral considera- 
tions. If researchers want to use a second label 
alongside mEos2, for example, they have to find 
one unaffected by both red and green wavelengths
of light. A protein described this year
could be a big help: it converts from orange to
far-red, a much desired colour that is distinct 
from both the natural fluorescence of cells and 
that of other popular fluorophores. “The palette 
is so small right now that any addition is a big 
step forward, especially if you add a colour in 
part of the spectrum that’s empty,” says Shtengel.

But researchers are succeeding in using two- 
colour super-resolution microscopy. That has 
allowed them to address questions such as 
whether cell-surface receptors implicated in 
cancer are randomly assorted or are co-localized
on the plasma membrane. Lippincott-Schwartz
has described a general technique
that allows researchers to quantify how pro- 
teins cluster together on plasma membranes, 
and to assess the size, abundance and density 
of clusters. 

Even illumination-based techniques are 
benefiting from new fluorophores. A technique 
called stimulated emission depletion works by 
pairing lasers: one excites a spot to fluoresce, 
and the other shrinks the area of fluorescence
by further exciting fluorophores on its
periphery into a special dark state. To
collect an image, the paired laser beams
scan across the sample, repeatedly applying
intense beams of light that force fluorophores
into the appropriate state but can also
damage cells. Hell and his
colleagues last month described a fluorescent
protein that can enable illumination-based
super-resolution microscopy in extremely
low light levels. Although most fluorescent
proteins bleach out, or lose their fluorescence,
with repeated imaging, this new protein can
be switched on and off more than 1,000 times.
The researchers were able to image dendritic
spines (signal-receiving outgrowths on neurons)
at light levels one million times lower
than had previously been documented, and the
technique can work with a standard confocal
microscope, says Hell.

“A lot of people [are trying to get] into super-resolution. I get the question of what probes to use probably ten times a week.”

Conventional (a) and super-resolution (b, c) microscopy of microtubules and clathrin protein clusters.

Lippincott-Schwartz and others are working
out ways to make conventional fluorophores
amenable to probe-based super-resolution
microscopy. Instead of lighting up just a few 
molecules at once, they activate an entire popu- 
lation and wait for the fluorescent proteins to 
slowly turn off. The analysis identifies the loss 
of signal, she explains. “As they bleach, mol- 
ecules switch off and leave a hole that can be 
fit to determine where the molecule was.” Data
for localizing dark holes are much noisier than 
those for localizing bright points, but the tech- 
nique allows researchers to work with several 
labels at once. In unpublished work, Lippincott- 
Schwartz has been able to visualize as many as 
four fluorophores in a single fixed sample, and 
she thinks that the technique can also be made 
to work in live cells. 
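
One way to picture this analysis: subtracting each frame from the one before it turns a molecule that has just bleached into a bright spot in the difference image, which can then be located like any other point of light. The following is only a minimal sketch in Python with NumPy, not the group's own method. The function names, the synthetic Gaussian spots and the simple centroid estimate (standing in for a proper fit) are all illustrative assumptions.

```python
import numpy as np

def gaussian_spot(shape, center, sigma=1.5, amplitude=100.0):
    """Render a 2D Gaussian spot (a stand-in for one molecule's blurred image)."""
    rows, cols = np.indices(shape)
    r0, c0 = center
    return amplitude * np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * sigma ** 2))

def locate_bleached(frame_before, frame_after):
    """Estimate where a molecule switched off between two frames.

    The difference image is positive where signal was lost; its
    intensity-weighted centroid approximates the molecule's position.
    """
    lost = np.clip(frame_before - frame_after, 0.0, None)
    total = lost.sum()
    if total == 0:
        return None  # nothing bleached between these frames
    rows, cols = np.indices(lost.shape)
    return (rows * lost).sum() / total, (cols * lost).sum() / total

# Two molecules are on in the first frame; one has bleached by the second.
shape = (32, 32)
frame1 = gaussian_spot(shape, (10, 12)) + gaussian_spot(shape, (20, 22))
frame2 = gaussian_spot(shape, (20, 22))
position = locate_bleached(frame1, frame2)
print(position)  # close to the bleached molecule's centre at row 10, column 12
```

Real data would add camera noise to every pixel of the difference image, which is one way to see why localizing dark holes is noisier than localizing bright points.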


DESIRABLE DYES 

To many cell biologists, the term fluorescent 
label is synonymous with fluorescent protein, 
but there are also small-molecule fluorescent 
dyes. Dyes tend to be more photostable than 
fluorescent proteins, so they can emit an order 
of magnitude more photons, which means 
that dye molecules can be detected and pin- 
pointed more reliably, explains Markus Sauer, 
a biophysicist at the University of Würzburg
in Germany. “The higher photon yield goes in 
hand with higher localization precision and 
thus a higher optical resolution,” he says. 
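
Sauer’s point can be made concrete with a common rule of thumb: when photon shot noise dominates, the uncertainty in a molecule’s estimated position is roughly the width of its blurred image divided by the square root of the number of photons collected. The sketch below, in Python with NumPy, checks that rule against a simple simulation. It ignores background light and camera pixelation, and the 100-nm spot width and the photon counts are assumed values chosen only for illustration.

```python
import numpy as np

def localization_precision(psf_sigma_nm, n_photons):
    """Shot-noise-limited precision: spot width divided by sqrt(photon count)."""
    return psf_sigma_nm / np.sqrt(n_photons)

def simulated_precision(psf_sigma_nm, n_photons, n_trials=2000, seed=0):
    """Empirical spread of the estimated centre over many simulated molecules."""
    rng = np.random.default_rng(seed)
    # Each photon lands at a position drawn from the blurred spot (1D Gaussian).
    photons = rng.normal(0.0, psf_sigma_nm, size=(n_trials, n_photons))
    centres = photons.mean(axis=1)  # centroid estimate for each molecule
    return centres.std()

psf_sigma_nm = 100.0  # assumed spot width, for illustration only
for n in (100, 1000):  # for example, a protein versus a dye emitting ten times more
    print(n, localization_precision(psf_sigma_nm, n), simulated_precision(psf_sigma_nm, n))
```

Under these assumptions, a tenfold increase in photons sharpens the position estimate by a factor of about three (the square root of ten).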

The speed at which dyes turn on and off is 
also an advantage. In the first demonstration of 
live-cell, three-dimensional STORM, Zhuang 
used six different probes: four dyes and two 
proteins. One of the dyes, Alexa 647, allowed
an image to be taken in one second; proteins 
required substantially longer, at 30 seconds per 
image. Collecting more images in less time is a 
practical advantage for all samples, particularly
for live cells, says Zhuang. “If you can’t switch
the probes fast, you can only image slow
processes,” she adds.

Xiaowei Zhuang of Harvard University looks for
tiny details using fluorescent proteins and dyes.

The problem is that dyes are often less con- 
venient than fluorescent proteins. Whereas 
researchers can label proteins with fluorescent 
proteins by introducing genes into cultured 
cells, dyes have to be attached in a separate step. 
The most common technique is to combine 
them with antibodies against a protein of interest.
Usually, researchers label ‘secondary antibodies’,
which themselves attach to antibodies
against the protein of interest, a practice that
allows the same reagents to be used in multiple
experiments. However, because it is the antibody
rather than the protein that is visualized,
the dyes are somewhat distant from the protein
of interest.

Antibodies can usually be used only on fixed
cells, but they do have advantages. Relevant
techniques are in common use, and antibodies
work in samples that can’t be transfected,
such as human biopsies. What is more, the
target proteins are produced naturally, rather
than from introduced genes, which can have
aberrant expression. Last year, Zhuang and
her colleagues reported that they had used
labelled antibodies with STORM to interrogate
the locations of ten different proteins within
synapses, distinguishing which occurred on
the signal-sending (pre-synaptic) side and
signal-receiving (post-synaptic) side, something
that would be impossible in conventional
microscopy because the synapse is so small.

There are also ways to use dyes without
antibodies: ‘soluble ligands’, or secreted proteins
that attach to cell surfaces, can be produced,
labelled and then added to cell cultures
directly. Intracellular proteins can be labelled
using a ‘hybrid-fusion’ approach. Instead of
being fused to a fluorescent protein, a protein
of interest is joined to a ‘protein hook’ that can
attach to the dye molecules. A variety of tags are
in use, and the technique can even work with
commercially available chemical-tag kits made
for conventional microscopy. But the dye can
sometimes attach to biomolecules other than
the target, says Robert Campbell, a protein
engineer at the University of Alberta in Edmonton.
“That raises up background fluorescence,
and that limits the level at which you can see
the protein.”


How to build a fluorescent protein


To be useful for super-resolution microscopy,
a fluorescent protein must have all the
properties necessary for standard imaging:
it can’t be toxic; it must label the intended
target; it must be biologically active at the
same temperatures as mammalian cells (no
small feat, given that these proteins generally
come from sea creatures living in chilly
waters); it must be bright; and its fluorescence
must stand out from background.

Such proteins are often taken from
jellyfish, anemones or coral, but they all have
the same general shape: a flat sheet rolled
up into a ‘β-barrel’, with a helix spiralling
into the centre. Three amino acids at the end
of the helix create the chromophore, the
part responsible for the fluorescence. For a
protein to be photoactivatable, the parts in
or around the chromophore must undergo
reactions catalysed by light. Swapping in
different amino acids can create new colours
and set the stage for light-activated reactions
that change the properties of the protein.
Even if a protein has the desired colour and
photoactivity, it may not be bright or well-behaved
enough to be useful in microscopy,
so researchers use random mutagenesis to
hunt for beneficial mutations over the entire
barrel. Super-resolution imaging makes
heavy demands on proteins, says Vladislav
Verkhusha, a structural biologist at Albert
Einstein College of Medicine in New York
City, who has made many photoactivatable
proteins, including the first far-red one. “If
you want to have tenfold better resolution,
you need 100-fold photostability,” he says.
And for reasons that aren’t entirely clear,
the way that proteins behave in a group does
not perfectly represent their behaviour at
the single-molecule level, explains George
Patterson, a physicist at the US National
Institutes of Health in Bethesda, Maryland,
who helped to develop the first practical
photoactivatable fluorescent protein and a
super-resolution technique.

Fluorescent proteins occur naturally as 
bulky tetramers, impractical for labels, so 
research groups have to modify them. One 
group might break the four-barrel proteins 
into individual stable barrels more suitable 
for labelling; another might shift the protein’s 
colour spectrum; another might make it 
photoactivatable; and yet others might make 
more general improvements. 

Researchers just can’t get enough. 

Now more than ever, the better the 
fluorophore, the more biology it can reveal. 
“The fluorophore is at centre stage of the 
whole development,” says Stefan Hell, a 
nanomicroscopist at the Max Planck Institute 
for Biophysical Chemistry in Göttingen,
Germany. “The fluorophore is decisive. It
allows you to get the pictures.” M.B.
Improved analysis will also help scientists
to get more from their labels. Researchers led 
by Sam Hess showed that three fluorophores
that all emit in the orange-red wavelengths 
could be distinguished from each other. Con- 
ventional microscopy would not be able to 
separate, say, a greenish-yellow label that emits 
two green photons for each red one from an 
orangish-yellow label that emits one green 
photon for each red one, but super-resolu- 
tion microscopy can distinguish such signals 
because emitted photons are attributed to indi- 
vidual proteins. In such a case, “it’s okay to have 
the emission spectrum overlap because you are 
imaging individual molecules”, says Hess. His
team was able to use three labels with over- 
lapping spectra to simultaneously image two 
membrane proteins and a cytoskeleton pro- 
tein, showing how these different components 
of the cell interact. 
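
The idea of telling labels apart by their photon ratios can be sketched in a few lines of Python. This is an illustration of the principle only, not the analysis that Hess’s team used: the function name, the label names and the photon counts are hypothetical, and the expected green fractions are taken from the two-to-one and one-to-one ratios in the example above.

```python
def classify_by_ratio(green_photons, red_photons, labels):
    """Assign one molecule to the label whose expected green fraction is closest."""
    total = green_photons + red_photons
    if total == 0:
        raise ValueError("no photons detected")
    fraction = green_photons / total
    return min(labels, key=lambda name: abs(labels[name] - fraction))

# Expected fraction of photons that are green, per label (hypothetical names).
labels = {
    "greenish-yellow": 2 / 3,  # two green photons for each red one
    "orangish-yellow": 1 / 2,  # one green photon for each red one
}

# Photon counts attributed to two individual molecules (made-up numbers).
print(classify_by_ratio(640, 330, labels))  # greenish-yellow
print(classify_by_ratio(410, 395, labels))  # orangish-yellow
```

The approach depends on photons being attributed to single molecules: if the light from both labels were pooled in one blurred spot, the ratio would reflect only their mixture.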


BETTER RESOLUTION THROUGH COMPUTATION 
Better analysis and more sophisticated algo- 
rithms should also help researchers who are 
using only one label at a time. To speed imag- 
ing, researchers would like to increase the 
number of fluorophores that emit light at any
given time. But if too many fluorophores emit 
too close together, their signals overlap and 
cannot be resolved into individual points. Sev- 
eral groups are working on software that lets 
scientists image more labels in smaller spaces. 
For example, researchers at the University of 
Oxford, UK, adapted an algorithm originally
developed to study crowded star systems, 
and used it in probe-based super-resolution 
microscopy. They showed that it could detect 
more fluorophores than could two imaging 
algorithms commonly used in microscopy. 
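
Why crowded fluorophores are hard to separate can be seen in a toy calculation. The Python sketch below is not the Oxford group’s algorithm; it only illustrates the underlying problem. It adds the blurred images of two equally bright emitters along a line and counts the peaks in the result. The Gaussian spot shape and the 100-nm width are assumptions made for illustration.

```python
import numpy as np

def count_peaks(separation_nm, psf_sigma_nm=100.0):
    """Count local maxima in the summed 1D profile of two equal emitters."""
    x = np.linspace(-1000.0, 1000.0, 4001)  # positions in nm, 0.5-nm steps
    profile = (np.exp(-(x + separation_nm / 2) ** 2 / (2 * psf_sigma_nm ** 2))
               + np.exp(-(x - separation_nm / 2) ** 2 / (2 * psf_sigma_nm ** 2)))
    interior = profile[1:-1]
    # A peak is a sample strictly higher than both of its neighbours.
    is_peak = (interior > profile[:-2]) & (interior > profile[2:])
    return int(is_peak.sum())

print(count_peaks(400.0))  # 2: far apart, the emitters appear as separate spots
print(count_peaks(100.0))  # 1: close together, they merge into a single blob
```

Software for dense samples has to infer how many emitters lie beneath such a merged blob, rather than simply looking for separate peaks.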

Super-resolution imaging reveals the molecular architecture that enables cellular adhesion.

Aleksandra Radenovic, a biophysicist at the
Swiss Federal Institute of Technology in Lausanne,
has designed computational approaches
to mitigate artefacts caused when ‘bleached’
proteins, which have supposedly lost their
fluorescence permanently, revert to a state in
which they can be activated. The effort grew
out of another project, exploring dense protein
clusters on the cell membrane. After a protein
fragment chosen as a negative control displayed
unexpectedly high levels of clustering,
Radenovic and her co-workers studied the
activation times of individual molecules of 
mEos2. The data showed that signals from 
similar locations clustered together in time. 
The sequence of signalling molecules should 
be random across a sample, so these results 
indicated that the same protein was signalling 
more than once and was being misinterpreted 
as multiple proteins, explains Radenovic. “Just 
looking at the time domain, you can get rid of 
those artefacts,” she says. 
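
A time-domain filter of this kind might look something like the following Python sketch. It is not Radenovic’s published method: it simply groups signals that recur at almost the same place within a few frames and counts each group as one molecule. The function name, the 30-nm distance cut-off and the five-frame gap are hypothetical values chosen for illustration.

```python
def merge_reactivations(localizations, max_distance_nm=30.0, max_gap_frames=5):
    """Group localizations that recur at nearly the same place within a short time.

    localizations: list of (x_nm, y_nm, frame) tuples.
    Returns a list of groups; each group is treated as a single molecule.
    """
    groups = []
    for loc in sorted(localizations, key=lambda item: item[2]):
        x, y, frame = loc
        for group in groups:
            last_x, last_y, last_frame = group[-1]
            close_in_space = (x - last_x) ** 2 + (y - last_y) ** 2 <= max_distance_nm ** 2
            close_in_time = frame - last_frame <= max_gap_frames
            if close_in_space and close_in_time:
                group.append(loc)  # a repeat signal from a molecule already seen
                break
        else:
            groups.append([loc])  # no match: start a new molecule
    return groups

observed = [
    (100.0, 100.0, 1),
    (104.0, 98.0, 3),    # same spot, two frames later
    (500.0, 250.0, 4),   # somewhere else entirely
    (101.0, 102.0, 6),   # same spot again after a short dark period
    (103.0, 99.0, 400),  # same spot much later: kept as a separate molecule
]
molecules = merge_reactivations(observed)
print(len(observed), "localizations ->", len(molecules), "molecules")
```

Choosing the time window is the delicate part: too short and one protein is still counted several times, too long and genuinely separate neighbours are merged.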

Although probe-based super-resolution 
microscopy can be done using standard fluo- 
rescence microscopes, several manufacturers 
offer systems built specifically for this purpose, 
along with software for analysing the data. 
Such microscopes are designed to optimize 
the activation of probes. Licensing agree- 
ments restrict which acronyms each manu- 
facturer uses in marketing, but a machine that 
works for one form of probe-based micros- 
copy generally works for other forms as well. 
Tokyo-based company Nikon has installed its 
system in dozens of labs; Leica Microsystems 
of Wetzlar and Zeiss of Oberkochen, both in 
Germany, have also introduced systems. And 
small start-up companies, such as Vutara in
Salt Lake City, Utah, are getting into the market
as well. Applied Precision of Issaquah,
Washington (acquired in April by GE Healthcare
of Fairfield, Connecticut), plans to roll
out its probe-based super-resolution system,
Monet, later this year. Most of these companies
also make instruments for illumination-based
microscopy, which require specialized
components.


FLUORESCENT PROTEINS FOR SUPER-RESOLUTION MICROSCOPY

Protein | Spectral states | Comments
mEos2; tandem dimer Eos | Green to red | Brightest photoactivatable proteins described so far. mEos2 behaves well when fused with target proteins. (Dendra2 and KikGR also convert from green to red, but not as well)
PS-CFP2 | Cyan to green | Can be used alongside red and orange proteins
PSmOrange | Orange to far-red | First photoconvertible protein in the advantageous far-red part of spectrum
PAmCherry | Dark to orange | Along with PATagRFP, can be used in combination with dark-to-green proteins
PA-GFP | Dark to green | Behaves well when fused with target proteins. First photoactivatable protein, but not as bright as others
Dronpa | Dark to green to dark (reversible) | The ability to switch from dark to light and back to dark allows tracking and live-cell applications


142 | NATURE | VOL 478 | 6 OCTOBER 2011

With or without dedicated instruments, 
researchers are keen to try their hand at super- 
resolution microscopy. So far, most papers 
demonstrate proof of principle for microscope 
methods rather than fundamental new biology 
uncovered by the techniques, but the balance 
is shifting, says Davidson. “It’s going to be an 
explosive field. It's just now raising its head, 
and it’s about to take off like a bat out of hell.”


Monya Baker is technology editor for Nature 
and Nature Methods. 
CAREERS




EDUCATION 


Inspiration for 
informatics 


Trainees in bioinformatics and computational biology 
should seek depth of knowledge over breadth. 


BY VIRGINIA GEWIN 


This January, Alexander Sczyrba and
his colleagues published what was at
the time the largest metagenome ever 
assembled (M. Hess et al. Science 331, 463-467; 
2011). Collecting and collating genetic material 
from environmental samples is always a chal- 
lenge; in this case, the metagenome came from 
parts of a cow’s stomach, and contained more 
than 27,000 biomass-degrading genes and 
15 microbe genomes. It totalled 268 gigabases. 
“We had to develop new algorithms to run 
analyses on computer clusters, or clouds, as 
using traditional methods would have taken 
80 years on a single computer,” says Sczyrba.
Sczyrba wants to focus his career on similar 
complex, leading-edge analyses. But the path 
hasn't been straightforward; when he was 
looking for a postdoc in 2008, it was tough to 
find institutions that could generate or analyse 
such large data sets. He landed a post at the US 
Department of Energy Joint Genome Institute 
(JGI) in Walnut Creek, California: a large-scale 
sequencing facility that offered access to data, 
computing resources and brain power. In 2010 
alone, the JGI sequenced 170 metagenomes. 
Soon, however, big sequencing centres won't 
be the only sources of data. “With next-gen- 
eration sequencing, everybody can produce 
sequences; it’s the analysis that is getting more 
important,” says Sczyrba. Modern biologists 
need to be able to manage large data sets and 
explore new computational tools. 


FINDING A PATH 

Qualified candidates are hard to find, say 
recruiters in both industry and academia. 
That may be because, so far, there hasn't been 
a typical career path for bioinformaticians 
or computational biologists. “Often we find 
that it’s the people motivated to simply roll up 
their sleeves and figure out on their own how 
to work with these data that have the strongest 
skills,” says Jim Bristow, deputy director of programmes
at the JGI. As more departments are
established, the often circuitous routes once 
required to attain such skills will probably be 
replaced by more direct paths. The challenge 
is finding a training programme that will help 
researchers to keep pace in a rapidly changing,
technology-driven field. 

By conventional definitions, bioinformati- 
cians develop new ways to acquire, organize 
and analyse biological data, whereas computa- 
tional biologists develop mathematical models 
or simulation techniques to work out the
data’s biological significance. But these lines
are blurring, and departments and training 
programmes are both proliferating and com- 
bining the fields. 

“The demand for computational-biology 
training that we have today is way more than 
was expected a decade ago,” says Burkhard 
Rost, president of the International Society 
for Computational Biology, which is based in 
La Jolla, California. 


NOT JUST SKIN DEEP 

The most obvious training route — pursuing 
an undergraduate degree in bioinformatics 
— isn't necessarily the best for a budding 
researcher. Some undergraduate programmes 
fail to provide the depth of knowledge sought 
by employers. “Often these trainees come with 
great-looking CVs, but when we press them on 
what they are capable of doing, they tend to 
be rather weak,” says Nick Goldman, research 
and training coordinator at the European Bio- 
informatics Institute in Hinxton, UK. Gold- 
man is most impressed by applicants who have 
actively pursued training in both informatics 
and the area of research in which they’re inter- 
ested — for example, someone with a comput- 
ing degree who has done a molecular-biology 
project (see ‘Talent checklist’).

Goldman says that students should be wary 
of learning about only the latest software 
or genome-mining tool, without gaining a 
full understanding of the biological topics. 
Recruiters want savvy scientists who under- 
stand technology's ability to address questions. 
Steve Cleaver, head of quantitative biology at 
Novartis Institutes for BioMedical Research in 
Cambridge, Massachusetts, says that the key 
to a sustainable career in the field is the ability 
to turn a scientific question into a statistical 
hypothesis. “But those who can ride the tech 
waves are well positioned to find career success,”
he adds. Without a doubt, he says, the
next generation of biologists will be more con- 
versant in bioinformatics. “It’s all about cross- 
training — getting the appropriate training 
in both analytical science and biology during 
graduate school to make a meaningful contri- 
bution,” says Cleaver. 

Picking a programme with comprehensive 
training modules in statistics, computer sci- 
ence and/or biology can be an effective strategy. 
But Soren Brunak, director of the Center for 
Biological Sequence Analysis at the Technical 
University of Denmark in Lyngby, says that 
researchers should avoid training programmes 
that focus on just a few data types. With the 
expansion in high-throughput sequencing 
of genomes, proteins and metabolites, pro- 
grammes that focus on a single area, such as 
genomics, don’t adequately prepare students 
for the job market, says Brunak. “Analyses con- 
ducted now are much more reliant on combina- 
tions of data types — for example, combining 
molecular-level data with patient records — 
than they were before,” he notes.


“We don’t know where we’ll be in ten years because the technologies and ideas are moving so fast.”
Alexander Sczyrba

Aspiring principal investigators can go one
step further to find the best graduate training
for the career they want, by deciding whether
to focus on developing tools, such as algorithms
to analyse data, or applying those tools to turn
data into knowledge. “The most important
decision a trainee can make is what kind of
research programme they want to build,”
says Robert Murphy, founding director of
a computational-biology PhD programme
run jointly between the University of
Pittsburgh in Pennsylvania and Carnegie
Mellon University, also in Pittsburgh.
The University of California, Los Angeles
(UCLA), has a bioinformatics PhD
programme designed
to shape the tool developers. It accepts only
candidates who demonstrate a core strength 
in an analytical field such as computer science 
or maths, or have a dual degree combining one 
of these fields with biology. Christopher Lee, 
director of the programme, says that many bio- 
informatics courses are affiliated with data-rich 
biology labs on campus, supplying the students 
needed to tackle a flood of data. They often lack, 
however, the matrix of expertise necessary to 
conduct innovative analyses. Lee hopes that the 
UCLA programme will foster such expertise. 

BASIC SKILLS
Talent checklist

• Be at least conversant in the broad
range of disciplines contributing to
bioinformatics, from statistics to
molecular biology to computer science.

• Most work, especially in industry, is
done in teams, so communication skills
are always in demand.

• Get experience in handling massive
data sets. Learn to parse data or run
analyses in parallel, using, for example,
cloud computing.

• Learn to write programmes in software
languages such as Perl or R.

• Cultivate a deep knowledge of at least
one area of biology. V.G.

A few graduate training programmes, notably
those at the Netherlands Bioinformatics
Center in Nijmegen, cater to students with
backgrounds in either computer science or 
biology. “We want to train the tool shapers as 
well as the people more into applying the tools 
in a biological setting,” says Celia van Gelder, 
the centre’s education project leader. “Over 
the past 10-20 years, the field of biology has 
become more computational, with bioinfor- 
matics serving as an interdisciplinary field that 
links researchers who can't otherwise readily 
talk to one another.” The scope of work is wid- 
ening, she says. As a result, demand for bioin- 
formatics training continues to increase across 
Europe — with greater emphasis placed on data 
analysis at all levels. “We produce trainees who 
have multidisciplinary training in molecular- 
biology principles as well as algorithms to deal 
with data,” says Jaap Heringa, the centre's sci- 
entific director for bioinformatics education. 
“Things move so fast in bioinformatics, we are 
constantly innovating our courses,” he adds.
Murphy agrees; Carnegie Mellon and the Uni- 
versity of Pittsburgh offer in-depth training. 
“We are pretty clear in the application materials 
that our programme is not for people who want 
to get enough of a smattering of computational
biology to get a job,” says Murphy.


EXPANDING OPTIONS 

This trend towards creating more comprehen- 
sive, interdisciplinary training programmes 
has gained momentum at biology strongholds 
in the United States. In July 2010, Dartmouth 
Medical School in Hanover, New Hampshire, 
established the Institute for Quantitative 
Biological Sciences in nearby Lebanon. Its 
graduate offerings combine modules in bio- 
informatics, biostatistics and epidemiology. 
“We have created what we think is a model of 
the future — training computational-biology 
students to speak multiple languages beyond 
bioinformatics,” says the centre’s director, 
Jason Moore. He adds that the key is assum- 
ing complexity rather than simplicity when 
approaching a problem. 

In August, Moore secured funding to cre- 
ate a US National Institutes of Health (NIH) 
Center for Biomedical Research Excellence, 
through which he will mentor five early- 
career bioinformatics faculty members, to be 
recruited over the next 3-4 years. After two 
years of learning how to secure competitive 
funding, among other things, trainees will be 
required to submit an application for an R01
grant, the NIH’s main funding mechanism. 
“We really want to provide a well rounded 
education so that our new recruits can secure 
funding for — and conduct — well designed 
studies in computational biology,” says Moore.

Other medical schools are also taking the 
plunge. Duke University School of Medicine in 
Durham, North Carolina, formed its Depart- 
ment of Biostatistics and Bioinformatics in 
2000. This year, it opens its first master’s pro- 
gramme, says Elizabeth Delong, chair of the 
department. 




And in September, the University of 
Michigan Medical School in Ann Arbor 
established a computational-medicine and 
bioinformatics department to help attract 
new faculty members and trainees. In June, 
Emory University School of Medicine in 
Atlanta, Georgia, launched a biomedical- 
informatics department with the goal of 
combining expertise in imaging, computer 
science and biology to improve patient care. 
It will recruit four or five researchers over 
the next few years. “Our particular strength 
is training computer scientists who want 
to transition into biomedical informatics, 
and bringing them together with clinicians 
to use informatics to treat disease,” says
department chair Joel Saltz. 

Qualified postdocs remain in demand. 
“It can be very difficult for individual investigators
to hire a postdoc in bioinformatics,”
says Tom Tullius, interim chair of the
bioinformatics programme at Boston Uni- 
versity in Massachusetts. He attributes the 
paucity of candidates in part to efforts over 
the past several years to build large teams 
at high-powered institutes (such as the
Broad Institute in Cambridge, Massachusetts,
or the Wellcome Trust Sanger Institute
in Cambridge, UK), leaving smaller
labs struggling to find talent. The
growth of training programmes
could ease this.

Now that sequencing centres won't be the sole
providers of data, individual researchers,
particularly at medical centres, will
have ample data to fuel research and
training. “We've passed out of the
period of genome projects where
there were amazing public data raining
down from the heavens; it’s now possible
to do exciting work without being associated
with data-generating centres,” says Lee.

“We want to train the tool shapers as well as the people more into applying the tools.”
Celia van Gelder

Sczyrba, who begins a junior faculty 
position in metagenomics at the University 
of Bielefeld Center for Biotechnology in 
Germany this autumn, says that unpredict- 
ability is what makes the discipline so excit- 
ing. “We don't know where we will be in ten 
years because the technologies and ideas are 
moving so fast,” he says. As Cleaver notes:
“Perhaps the best career strategy is to stay 
flexible and curious.”


Virginia Gewin is a freelance writer in 
Portland, Oregon. 


COLUMN 


The human touch 


A little empathy goes a long way in the competitive 
confines of a laboratory, argues Lydia Soraya Murray. 


As almost every scientist knows, a
first year in research is an
emotional minefield. One minute 
you’re flying high. The next, you're banging
your head against the wall, resisting the urge 
to draw in results with a marker pen. Forget the 
F-word; in science, it’s the O-word that gener- 
ates dread. I sometimes think that ‘optimizing’ 
should be spelled ‘r.e.p.e.a.t.e.d failures’.
After a stimulating yet often soul-destroying 
start to my PhD, I have decided that coming to 
terms with the lows is one of the most impor- 
tant things that you can take away from your 
first year. Never mind the dreaded literature 
review; this is unquestionably more important. 
Because scientists do incredibly special- 
ized and often misunderstood work, it can be 
hard for people outside our particular fields to 
empathize with our attachment to our projects. 
I have a close friend who is a physician. After 
several weeks of my hard work culminated in 
what can be described only as ‘diddly squat’,
my friend offered these consoling words: “It’s 
not as if someone has died”. To this day, I don’t 
think he realizes how close he came to getting 
stabbed in the eye with a pipette. Instead of 
taking bloody revenge, I pointed out that if 
researchers didn't care so much, he would still 
be treating head colds with leeches — a less 
satisfying but more legal response. 
Unfortunately, voicing frustrations to col- 
leagues can be just as futile, and prompt the 
short and not so sweet response: “That's sci- 
ence”. To be sure, cultivating a career in the 
frighteningly competitive world of research 
leaves no room for hand-holding or molly- 
coddling, and I truly believe that principal 
investigators need full-body elephant-hide 
transplants to achieve the thick skin required 
for the job. However, we are all human and
everyone needs some sort of coping mechanism.
Losing this mechanism is always a
disaster. 

In truth, there is no magic answer for how 
to deal with a disappointment rate of 90%. 
Some people build up walls to protect them- 
selves, but this can result in suppression of all 
emotion. And let’s be honest: given the hours 
that scientists work and the wages we earn, it 
is mostly our passion that keeps us chained 
to the lab bench. Dulling the rare moments 
of true toe-tingling excitement when things 
work and we discover something for the first 
time would be far too big a sacrifice. But others 
might have quite a different attitude and feel 
the disappointment so acutely that it destroys 
their confidence and paralyses them. Channel- 
ling your emotion into something manageable 
is truly important. I suggest that anyone new 
to research should find healthy ways to deal 
with their frustrations. Some people read or 
play a sport; others go out dancing. My cop- 
ing mechanism is a large glass of red wine and 
a fantastic group of friends who put up with 
my rants, then shut me up with a good dose of 
perspective and insight. 

Humans are social animals, and sometimes
solitude enhances the feeling of ineptitude 
and makes dealing with a disappointment 
even harder. Maybe next time someone wan- 
ders past you gazing forlornly at their lab book 
with that oh-so-familiar look of puzzlement 
and frustration, a wee pat on the back and a bit 
of camaraderie might help. Yes, ‘that’s science’.
But perhaps they'll see that success is possible 
despite repeated failures. 


Lydia Soraya Murray is a PhD student in 
molecular genetics and cell biology at the 
University of Glasgow, UK. 
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And in September, the University of 
Michigan Medical School in Ann Arbor 
established a computational-medicine and 
bioinformatics department to help attract 
new faculty members and trainees. In June, 
Emory University School of Medicine in 
Atlanta, Georgia, launched a biomedical- 
informatics department with the goal of 
combining expertise in imaging, computer 
science and biology to improve patient care. 
It will recruit four or five researchers over 
the next few years. “Our particular strength 
is training computer scientists who want 
to transition into biomedical informatics, 
and bringing them together with clinicians 
to use informatics to treat disease,” says
department chair Joel Saltz. 

Qualified postdocs remain in demand. 
“It can be very difficult for individual inves-
tigators to hire a postdoc in bioinformat-
ics,” says Tom Tullius, interim chair of the
bioinformatics programme at Boston Uni- 
versity in Massachusetts. He attributes the 
paucity of candidates in part to efforts over 
the past several years to build large teams 
at high-powered institutes — such as the 
Broad Institute in Cambridge, Massachu- 
setts, or the Wellcome Trust Sanger Institute 
in Cambridge, UK 
— leaving smaller 
labs struggling to 
find talent. The 
growth of train- 
ing programmes 
could ease this. 

Now that sequencing centres won't be the
sole providers of data, individual
researchers, particularly at medical
centres, will have ample data to fuel
research and training. “We've passed out
of the period of genome projects where
there were amazing public data raining
down from the heavens; it’s now possible
to do exciting work without being associ-
ated with data-generating centres,” says Lee.

“We want to train the tool shapers as well
as the people more into applying the
tools.” Celia van Gelder

Sczyrba, who begins a junior faculty 
position in metagenomics at the University 
of Bielefeld Center for Biotechnology in 
Germany this autumn, says that unpredict- 
ability is what makes the discipline so excit- 
ing. “We don't know where we will be in ten 
years because the technologies and ideas are 
moving so fast,” he says. As Cleaver notes:
“Perhaps the best career strategy is to stay
flexible and curious.”


Virginia Gewin is a freelance writer in 
Portland, Oregon. 


COLUMN 


The human touch 


A little empathy goes a long way in the competitive 
confines of a laboratory, argues Lydia Soraya Murray. 


As almost every scientist knows, a
first year in research is an
emotional minefield. One minute
you're flying high. The next, you're banging
your head against the wall, resisting the urge
to draw in results with a marker pen. Forget the
F-word; in science, it’s the O-word that gener-
ates dread. I sometimes think that ‘optimizing’
should be spelled ‘r.e.p.e.a.t.e.d failures’.
After a stimulating yet often soul-destroying 
start to my PhD, I have decided that coming to 
terms with the lows is one of the most impor- 
tant things that you can take away from your 
first year. Never mind the dreaded literature 
review; this is unquestionably more important. 
Because scientists do incredibly special- 
ized and often misunderstood work, it can be 
hard for people outside our particular fields to 
empathize with our attachment to our projects. 
I have a close friend who is a physician. After 
several weeks of my hard work culminated in 
what can be described only as ‘diddly squat’,
my friend offered these consoling words: “It’s 
not as if someone has died”. To this day, I don’t 
think he realizes how close he came to getting 
stabbed in the eye with a pipette. Instead of 
taking bloody revenge, I pointed out that if 
researchers didn't care so much, he would still 
be treating head colds with leeches — a less 
satisfying but more legal response. 
Unfortunately, voicing frustrations to col- 
leagues can be just as futile, and prompt the 
short and not so sweet response: “That's sci- 
ence”. To be sure, cultivating a career in the 
frighteningly competitive world of research 
leaves no room for hand-holding or molly- 
coddling, and I truly believe that principal 
investigators need full-body elephant-hide 
transplants to achieve the thick skin required 
for the job. However, we are all human and 


everyone needs some sort of coping mecha- 
nism. Losing this mechanism is always a 
disaster. 

In truth, there is no magic answer for how 
to deal with a disappointment rate of 90%. 
Some people build up walls to protect them- 
selves, but this can result in suppression of all 
emotion. And let’s be honest: given the hours 
that scientists work and the wages we earn, it 
is mostly our passion that keeps us chained 
to the lab bench. Dulling the rare moments 
of true toe-tingling excitement when things 
work and we discover something for the first 
time would be far too big a sacrifice. But others 
might have quite a different attitude and feel 
the disappointment so acutely that it destroys 
their confidence and paralyses them. Channel- 
ling your emotion into something manageable 
is truly important. I suggest that anyone new 
to research should find healthy ways to deal 
with their frustrations. Some people read or 
play a sport; others go out dancing. My cop- 
ing mechanism is a large glass of red wine and 
a fantastic group of friends who put up with 
my rants, then shut me up with a good dose of 
perspective and insight. 

Humans are social animals, and sometimes
solitude enhances the feeling of ineptitude 
and makes dealing with a disappointment 
even harder. Maybe next time someone wan- 
ders past you gazing forlornly at their lab book 
with that oh-so-familiar look of puzzlement 
and frustration, a wee pat on the back and a bit 
of camaraderie might help. Yes, ‘that’s science’.
But perhaps they'll see that success is possible 
despite repeated failures. 


Lydia Soraya Murray is a PhD student in 
molecular genetics and cell biology at the 
University of Glasgow, UK. 


HERE BE MONSTERS 


BY STEPHANIE ZVAN 


Karee stuck her head into the windowless
lab. “Doc?” Doctor Andrews
stared at her screen, chewing a
twisted strand of hair.


“Uh, Doc?” 
“Oh!” Andrews started. “Is it time 
already?” 


“Actually, I'm running late.”

“Right.” Doc Andrews stood up but kept 
watching the screen. 

“Doc...” 

“Sorry, sorry.” She turned. “It’s just that I
might finally be there. If the lab in Sweden 
has duplicated my results, we may finally 
have a cure, even for the worst cases.” 

“Really, Doc? That’s wonderful!” Karee’s 
voice rang in the concrete hallway. “No more 
muco-whatsis?” 

“No more MPS.” Andrews laughed, a 
sound of pure joy. “No more sick babies. No 
more stunted bodies and minds. Just healthy 
children, beautiful and sound.” She licked 
her lower lip. 

Despite her best effort, Karee twitched a 
little. She scratched her shoulder to cover it. 
“That’s great, Doc. Great.” It was wonderful 
news, but... They arrived at the doctor's 
room, none too soon. “Here we are. Have 
you eaten?” 

She often forgot. 

“While I was waiting.” Andrews grabbed
what looked like long underwear off the back 
of a chair and headed into the bathroom. A 
few minutes and some running water later, 
she was changed and back. She sat on her 
bed. “All set.” 

The helmet always looked uncomfort- 
able to Karee, bulky and claustrophobic, 
and the relish with which Andrews put it on 
didn't make Karee any happier. She waited 
for Andrews to settle into her special pillow. 
A light on the helmet indicated everything 
was synching properly. Small power and 
data cables clipped onto the pyjamas. Gloves 
attached to the sleeves completed the outfit. 

“You good, Doc?” 

“Oh, yes.” Andrews gave a little wriggle of
anticipation. 

Karee swallowed. No getting used to that. 
“Good night, then.” 

With the last of the inmates ‘shelved and
synched’, she signed out using her passkey.
Her ward was quiet, with the exception of 
the occasional low moan. Time to leave her 
charges to the night staff and rejoin society. 

There was no good reason to wash her
hands at the end of her shift, but she always
did. Her face too. The cool evening breeze 
found the spots around her ears she hadn't 
quite dried. She shivered but felt much 
lighter than she had inside. 

Then she saw the sign through the fence. 
Poster paper on a broom handle, it said sim-
ply: Here be monsters. It must be Thursday.

Theo was alone, as he always was these 
days. The year before, his wife, Hannah, had 
waded into the lake and swum out farther 
than she could swim back. 

No one at the facility ever spoke to him, 
but everyone knew his story. Everyone knew 
about Hannah, and everyone knew about 
their son. 

Claude was born the year the facility 
opened. Theo and Hannah weren't protest-
ing then, but plenty of others were. Sure, the
paedophiles and compulsive sadists should 
be locked up, but using VR to give them what 
they wanted? Victimless as it was, it still felt 
wrong. 

Karee understood. She did. But study 
after study had showed that no treatment 
was effective enough in changing inher- 
ently antisocial sexual orientations and that 
the stigma surrounding them only made 
people more likely to offend. Involuntary 
commitment after the fact couldn't help the 
victims. Voluntary commitment with the
VR as incentive worked despite its
unpopularity. The video interviews had at
least made it acceptable. Doctor Andrews
had been one of the first interviewed, and
Karee had transferred from
maximum security to guarding voluntary
commitment after seeing the look of relief
on Andrews’ face. Here Karee could make
a difference.

The difference hadn't come in time for 
Claude. When the neighbour who raped and 
murdered him was found to be one of those 
organizing the local demonstrations, most of 
the remaining protests stopped. More people 
volunteered to be locked up. Claude's death 
saved uncounted children, but it destroyed 
his parents. And when their neighbour was 
murdered in prison two years later, Claude’s 
family was left without any target for their 
anger except Karee’s facility. 

So, here it was, Thursday again, and there 
stood Theo with his sign. He hefted it a little 
higher when Karee was buzzed through the 
gate, but he didn't look at her. He never had, 
not once in the past five years. 

Karee sighed, suddenly tired. Doc Andrews 
and the others were rewarded for making the 
world a safer place. They were happy. Why 
wasn't she? The kids were taken care of, and 
so were the ... well, the monsters who'd agreed
not to threaten them. Everybody was taken 
care of, in fact, except her and... 

Karee took a deep breath. “Hey, Theo?” 

Startled, Theo looked at her for once. Karee 
had dreaded seeing anger or hate on his face, 
but the blankness there disturbed her more. 
It said he'd forgotten what his protest was sup-
posed to accomplish. He looked terribly old.

Karee nodded up the street. “You've got 
to be just about done here, right? Want to go 
get a cup of coffee?” Coffee seemed like such 
a simple, uncomplicated good thing, what 
she wanted more than anything else in the 
world right now. 

The blankness stared back at her. Then 
Theo opened his mouth. It worked for a bit, 
no sound coming out, but confusion was an 
improvement. Karee realized they must be 
about the same age. 

“Come on.” Karee motioned with her hand.
“Coffee shop’s just on the corner. I'll buy.” 

Finally, Theo rested his sign against the 
fence and nodded. His smile was tentative, 
but it was a smile. As they walked away, 
Karee felt herself grinning for the first time 
in years.


Stephanie Zvan is a Minneapolis writer 
fascinated with the strains that science and 
society place on one another. She blogs on 
that and more at Almost Diamonds. 

S-nitrosylation of NADPH oxidase regulates cell death in plant immunity

Changes in redox status are a conspicuous feature of immune
responses in a variety of eukaryotes, but the associated signalling
mechanisms are not well understood. In plants, attempted micro-
bial infection triggers the rapid synthesis of nitric oxide and a
parallel accumulation of reactive oxygen intermediates, the latter
generated by NADPH oxidases related to those responsible for the
pathogen-activated respiratory burst in phagocytes. Both nitric
oxide and reactive oxygen intermediates have been implicated in
controlling the hypersensitive response, a programmed execution
of plant cells at sites of attempted infection. However, the
molecular mechanisms that underpin their function and coord- 
inate their synthesis are unknown. Here we show genetic evidence 
that increases in cysteine thiols modified using nitric oxide, termed 
S-nitrosothiols, facilitate the hypersensitive response in the 
absence of the cell death agonist salicylic acid and the synthesis 
of reactive oxygen intermediates. Surprisingly, when concentra- 
tions of S-nitrosothiols were high, nitric oxide function also 
governed a negative feedback loop limiting the hypersensitive res- 
ponse, mediated by S-nitrosylation of the NADPH oxidase, 
AtRBOHD, at Cys 890, abolishing its ability to synthesize reactive 
oxygen intermediates. Accordingly, mutation of Cys 890 compro-
mised S-nitrosothiol-mediated control of AtRBOHD activity, per-
turbing the magnitude of cell death development. This cysteine is
evolutionarily conserved and specifically S-nitrosylated in both
human and fly NADPH oxidase, suggesting that this mechanism
may govern immune responses in both plants and animals.

Complex plants do not possess a nitric oxide synthase structurally 
related to those found in animals; nevertheless, a number of potential 
sources for pathogen-triggered nitric oxide synthesis have been 
described, including nitrate reductase and an arginine-dependent 
nitric-oxide-synthase-like activity. S-nitrosylation, the addition of a
nitric oxide moiety to a reactive cysteine thiol to form an
S-nitrosothiol (SNO), is an important route for nitric oxide bioactivity.
In Arabidopsis, an S-nitrosoglutathione (GSNO) reductase (AtGSNOR1)
governs both the concentrations of GSNO and, indirectly, protein
SNOs. We determined the temporal profile of SNO concentrations
during the development of hypersensitive response in atgsnor1-3 and
atgsnor1-1 plants, in which AtGSNOR1 activity is absent or increased,
respectively. Thus, SNO concentrations were anticipated to be higher
in atgsnor1-3 and lower in atgsnor1-1 plants, relative to wild type.
SNO concentrations were also determined in the NO overproducing 1
(nox1) mutant. Such plants were challenged with the bacterial patho-
gen Pseudomonas syringae pv. tomato (Pst) DC3000 expressing either
AvrB or AvrRps4 effector proteins, which are recognized by the resist-
ance (R) gene products RPM1 and RPS4, respectively, each a prototypic
member of a distinct R protein subclass. In each case, SNO concen-
trations increased over time in all the Arabidopsis lines tested (Fig. 1a, b),
relative to Pst DC3000 controls (Supplementary Fig. 1). However, 

Figure 1 | SNOs positively regulate cell death by hypersensitive response.
a, Profile of SNO accumulation following challenge with Pst DC3000 (avrB).
b, SNO accumulation following attempted Pst DC3000 (avrRps4) infection.
c, Total salicylic acid (SA) accumulation in response to attempted Pst DC3000
(avrB) colonization. d, Accumulation of salicylic acid following Pst DC3000
(avrRps4) challenge. e, Cell death development in the given Arabidopsis
genotypes, triggered by 5 × 10° c.f.u. ml⁻¹ Pst DC3000 (avrB) at 15 h.p.i. and
scored by trypan blue staining. c.f.u., colony-forming unit; h.p.i., hours post-
inoculation. f, Magnitude of cell death development in the stated plant lines
following challenge with 5 × 10° c.f.u. ml⁻¹ Pst DC3000 (avrRps4) at 15 h.p.i.,
determined by trypan blue staining. g, Cell death development in the given
Arabidopsis genotypes, triggered by 5 × 10° c.f.u. ml⁻¹ Pst DC3000 at 15 h.p.i.
and scored by trypan blue staining. h, Extent of cell death development
established by electrolyte leakage in the given Arabidopsis genotypes following
challenge with Pst DC3000 (avrB) at 1 × 10° c.f.u. ml⁻¹. i, Quantification of cell
death by electrolyte leakage in the stated plant lines in response to Pst DC3000
(avrRps4). j, Growth of H. arabidopsidis Emwa1 at 10 d post-inoculation in the
given Arabidopsis genotypes. Data points represent mean ± s.e. (n = 3). Unless
stated otherwise, avirulent strains of Pst DC3000 were infiltrated at
1 × 10° c.f.u. ml⁻¹.

concentrations of these molecules were higher in atgsnor1-3 and nox1
plants than in wild type. Similar results were obtained when we used
reporters of NO accumulation (Supplementary Fig. 2a–d).

We also determined the profile of salicylic acid accumulation
(salicylic acid is a cell death agonist) in these lines. Total salicylic acid
accumulation was diminished in atgsnor1-3 and nox1 plants relative to
wild type (Fig. 1c, d and Supplementary Fig. 3a), as was free salicylic
acid and salicylic acid β-glucoside (Supplementary Fig. 3b–e).
Together, these results suggest that atgsnor1-3 and nox1 plants accrue
markedly more SNOs over time during the development of hyper-
sensitive response, and that the atgsnor1-1 line accumulates signifi-
cantly fewer. Further, salicylic acid concentrations are diminished in
atgsnor1-3 and nox1 plants.

Next we assessed the development of hypersensitive response in 
these plants. Challenge with Pst DC3000 expressing either avrB or 
avrRps4 revealed that this defence response was delayed in atgsnor1-1 
plants relative to wild type. In contrast, the development of hyper- 
sensitive response in atgsnor1-3 and nox1 plants was accelerated 
(Supplementary Fig. 4a, b). To determine the extent of cell death by 
hypersensitive response (CDHR), a smaller inoculum of Pst DC3000 
strains was used and the resulting leaves were stained with trypan 
blue, which marks dead or dying plant cells. Relative to wild type,
atgsnor1-3 and nox1 plants showed a prominent increase in cell death,
but this response was markedly reduced in the atgsnor1-1 line (Fig.
1e–g). We corroborated these findings by quantifying cell-death-
induced electrolyte leakage. Again, cell death was significantly greater
in atgsnor1-3 and nox1 plants than in wild type, and there was a
decrease in the atgsnor1-1 line (Fig. 1h, i and Supplementary Fig. 5).
To confirm and extend these findings, we studied the effect of high
SNO concentrations mediated by atgsnor1-3 on the hypersensitive
response in the absence of SALICYLIC ACID INDUCTION
DEFICIENT 2 (SID2) function, which is required for pathogen-
triggered salicylic acid synthesis. There was no significant difference
in the extent of cell death development in the atgsnor1-3 sid2 double
mutant relative to the atgsnor1-3 line (Supplementary Fig. 6).
Collectively, these findings imply that in atgsnor1-3 and nox1 plants,
the development of CDHR has accelerated kinetics and increased 
magnitude. However, in atgsnor1-1 plants, cell death development is 
reduced. Thus, despite diminished salicylic acid concentrations, a 
greater concentration of SNO positively regulates the development 
of the hypersensitive response mediated by at least two distinct R gene 
subclasses. 

CDHR does not seem to be required for limiting bacterial infec-
tion. Therefore, to identify a potential role for SNO-driven cell death
in disease resistance, we challenged atgsnor1-3 plants with the avirulent
oomycete Hyaloperonospora arabidopsidis isolate Emwa1, which is
recognized by RPP4 (ref. 16). The death of challenged host cells has
been proposed as a key resistance mechanism against oomycetes. As
expected, cell death was more pronounced in atgsnor1-3 plants, relative
to wild type, in response to Emwa1 (Supplementary Fig. 7). In addition
to its role in cell death, salicylic acid is also a key immune activator, and
plants defective in its accumulation routinely show diminished defence
responses. Indeed, the Arabidopsis sid2 mutant, in which pathogen-
induced salicylic acid accumulation is reduced, was compromised
in RPP4-mediated resistance against Emwa1 (Fig. 1j). By contrast, even
though salicylic acid concentrations were equally reduced, relative to
wild type, in atgsnor1-3 plants (Supplementary Fig. 8a–c), Emwa1
failed to complete its life cycle in these plants, just as it failed
in the resistant wild-type line (Fig. 1j). An atgsnor1-3 sid2 double
mutant also had increased resistance against Emwa1 relative to sid2
plants (Fig. 1j). Moreover, although SNO and nitrite concentrations
are higher in atgsnor1-3 plants, in sid2 mutants they are comparable to
wild type (Supplementary Fig. 9a, b). Therefore, cell death develop-
ment mediated by increased SNO is sufficient to confer resistance
against Emwa1 in the absence of salicylic acid accumulation and asso-
ciated defence responses.

NO function is thought to be closely interconnected with that of
reactive oxygen intermediates (ROIs) in cell death development. We
therefore monitored ROI accumulation activated by distinct R pro-
teins in atgsnor1 and nox1 mutants, as determined by 3,3′-diamino-
benzidine (DAB) staining. Relative to wild type, atgsnor1-3 and nox1
plants showed decreased pathogen-induced ROI accumulation,
whereas mutant atgsnor1-1 plants accumulated more ROIs even in 
the absence of pathogen challenge (Fig. 2a, b and Supplementary 
Fig. 10a, b). Hence, in addition to being autonomous of salicylic acid, 
increased SNO concentrations may also facilitate CDHR indepen- 
dently of DAB-detectable ROI accumulation. To explore this possibil- 
ity further, we studied the effect of high SNO concentrations on the 
hypersensitive response in the absence of the NADPH-dependent 
oxidases AtRBOHD and AtRBOHF, which drive pathogen-induced 
ROI synthesis. As expected, atrbohD and atrbohF single and double
mutants showed less pathogen-induced CDHR than did wild-
type plants, indicating that ROI synthesis is required for full develop-
ment of the hypersensitive response (Fig. 2c and Supplementary Fig. 11).
However, CDHR in atgsnor1-3 atrbohD, atgsnor1-3 atrbohF and
atgsnor1-3 atrbohD atrbohF mutants was not significantly different from
that in atgsnor1-3 plants (Fig. 2c and Supplementary Figs 11 and 12a), despite
the reduced ROI accumulation in these lines (Supplementary Fig. 12b). 
These results suggest that high SNO concentrations can facilitate 
CDHR independently of ROI synthesis mediated by AtRBOHD or 
AtRBOHF. 

Next we asked how SNO concentrations governed by AtGSNOR1 
are able to manipulate ROI concentrations. AtRBOHD is responsible
for virtually all the DAB-detectable ROIs induced by avirulent strains
of Pst DC3000. Thus, it is plausible that SNO concentrations regulate
AtRBOHD function. However, basal or pathogen-triggered changes in 
SNO concentration did not influence AtRBOHD protein concentra- 
tions, as determined by monitoring a Myc-AtRBOHD fusion protein 
detected by a Myc-specific antibody in atrbohD atgsnor1 double 

Figure 2 | Increased SNO concentrations blunt NADPH oxidase activity
and reduce ROI accumulation. a, ROI accumulation determined by DAB
staining in the given Arabidopsis genotypes following challenge with Pst
DC3000 (avrB). a.u., arbitrary units. b, Accumulation of ROIs in the stated
plant lines in response to attempted Pst DC3000 (avrRps4) infection. c, Cell
death development in atgsnor1 double and triple mutants in response to
attempted Pst DC3000 (avrB) infection at 1 × 10° c.f.u. ml⁻¹ at 48 h.p.i. A
Student’s t-test comparing cell death in atgsnor1-3 with that in atgsnor1-3
atrbohF plants, in a similar fashion to the other double and triple mutants,
established that there was no statistically significant difference (P = 0.6466).
d, NADPH oxidase activity in Arabidopsis following exposure to the given
natural nitric oxide donors or related control treatments. ΔA at 480 nm, change
in absorbance. e, NADPH oxidase activity in given Arabidopsis lines at 24 h.p.i.
following challenge with Pst DC3000 (avrB). Data points represent mean ± s.e.
(n = 3). Avirulent strains of Pst DC3000 were infiltrated at 1 × 10° c.f.u. ml⁻¹
unless stated otherwise.

mutants (Supplementary Fig. 13). To explore whether SNOs could
directly regulate NADPH oxidase activity, microsomal preparations 
from pathogen-challenged wild-type leaves were treated with the 
natural nitric oxide donor GSNO or S-nitroso-L-cysteine (Cys-NO), 
and NADPH oxidase activity was determined. Exposure to either 
GSNO or, to a lesser extent, Cys-NO significantly reduced the activity 
of this enzyme relative to the buffer control treatment (Fig. 2d). 
Furthermore, the absence of an effect following exposure to reduced 
glutathione (GSH) confirmed the specificity of this response. To deter-
mine the possible biological consequences of these findings, we mea-
sured the activity of this protein in atgsnor1-3, atgsnor1-1, nox1 and
wild-type leaves challenged with Pst DC3000 (avrB). NADPH oxidase
activity was significantly reduced in atgsnor1-3 and nox1 plants that
have high SNO concentrations (Fig. 2e). Collectively, these findings 
suggest that changes in SNO concentrations can modulate NADPH 
oxidase activity, implying that this protein might be regulated by 
S-nitrosylation. 
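
The comparisons above are reported as mean ± s.e. (n = 3), and one of them is assessed with a Student's t-test. As a rough sketch of that arithmetic only, the Python below computes a mean, its standard error and a pooled-variance t statistic for two groups. The readings are invented placeholders, not data from this study, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def student_t(group_a, group_b):
    """Two-sample Student's t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2) / df
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Placeholder readings (arbitrary units), three replicates per line; NOT values from the paper.
wild_type = [1.0, 2.0, 3.0]
mutant = [2.0, 3.0, 4.0]

wt_mean, wt_se = mean_and_se(wild_type)
t_stat, df = student_t(wild_type, mutant)
print(f"wild type: {wt_mean:.2f} ± {wt_se:.2f} (n = {len(wild_type)})")
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

Turning the t statistic into a P value requires the t distribution, which a statistics package would normally supply.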

To determine whether NADPH oxidase might be S-nitrosylated, 
recombinant protein was exposed to a range of either GSNO or Cys-
NO concentrations typically used to score for S-nitrosylation in vitro,
and was monitored for the formation of SNO-AtRBOHD by the biotin-
switch method. AtRBOHD was S-nitrosylated in a concentration-
dependent fashion by either GSNO or Cys-NO (Fig. 3a, b).
Furthermore, the addition of dithiothreitol strikingly reduced the
amount of SNO-AtRBOHD formed (Fig. 3b), consistent with the
presence of a reversible thiol modification. AtRBOHD contains a num-
ber of cysteines that might serve as sites for this redox-based modifica-
tion. The carboxy- and amino-terminal portions of this protein were
therefore expressed, exposed to GSNO and subjected to biotin-switch
analysis. Only the C-terminal portion of AtRBOHD was S-nitrosylated
(Fig. 3c). The C-terminal portion of AtRBOHD has two cysteines: Cys
825 and Cys 890. These residues were therefore mutated either
individually or in combination, and the resulting proteins were
expressed, treated with GSNO and analysed by the biotin-switch
method. The Cys890Ala mutation but not the Cys825Ala mutation
abolished S-nitrosylation of AtRBOHD (Fig. 3d). Mass spectrometry
analysis was also consistent with S-nitrosothiol formation at Cys 890 
(Supplementary Fig. 14). Collectively, these findings imply that 
AtRBOHD is specifically S-nitrosylated in vitro on Cys 890. 

Cys 890 is evolutionarily conserved, suggesting that NADPH 
oxidases from other eukaryotes might also be S-nitrosylated (Sup- 
plementary Fig. 15). We therefore exposed recombinant human and 
Drosophila RBOH proteins to either GSNO or Cys-NO. Both of these 
proteins were specifically S-nitrosylated (Supplementary Fig. 16a, b). 
The site of SNO formation was Cys 537 for human NOX2 (also known 
as CYBB) and Cys 1315 for Drosophila Nox (Supplementary Fig. 16c, 

Figure 3 | S-nitrosylation of AtRBOHD. a, S-nitrosylation of recombinant
AtRBOHD in vitro by the nitric oxide donor GSNO. b, Cys-NO S-nitrosylates
recombinant AtRBOHD and this is reversible by treatment with dithiothreitol
(DTT). c, The C terminus of AtRBOHD is S-nitrosylated by GSNO.
d, S-nitrosylation analysis of wild-type and mutant AtRBOHD derivatives. e, In
vivo S-nitrosylation of wild-type and mutant derivatives of AtRBOHD in the
given plant genotypes 24 h after 5 × 10° c.f.u. ml⁻¹ Pst DC3000 (avrB)
infiltration. AtRBOHD was detected by virtue of its c-Myc tag using antibodies
raised against this tag. AtRBOH proteins in a–d were examined 20 min post-
exposure to nitric oxide donors. All experiments were repeated at least twice,
with similar results.

d), both of which corresponded to Cys 890 of AtRBOHD (Supplemen-
tary Fig. 15). Together, these data suggest that NADPH oxidases from
at least two animals are specifically S-nitrosylated at this conserved 
cysteine, raising the possibility that this redox modification might 
regulate the activity of these enzymes in many other eukaryotes. 
Interestingly, an organizer protein, absent from plants, that interacts 
with animal NADPH oxidases may also be subject to S-nitrosylation in 
endothelial cells. Therefore, in mammals NADPH oxidase function
might also be regulated indirectly by another reactive cysteine residue. 
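
The correspondence between Cys 890 of AtRBOHD, Cys 537 of human NOX2 and Cys 1315 of Drosophila Nox rests on a sequence alignment (Supplementary Fig. 15). The sketch below illustrates, under stated assumptions, how a single alignment column maps to a different residue number in each protein. The aligned fragments are made-up toy strings, not the real sequences, and the helper names are hypothetical.

```python
def residue_number(aligned_seq, column):
    """1-based residue number at an alignment column, or None if that column is a gap."""
    if aligned_seq[column] == "-":
        return None
    # Count only real residues (non-gap characters) up to and including the column.
    return sum(1 for ch in aligned_seq[: column + 1] if ch != "-")


def conserved_residue(alignment, column):
    """Return the shared residue if every sequence carries the same one at this column."""
    residues = {seq[column] for seq in alignment.values()}
    if len(residues) == 1 and "-" not in residues:
        return residues.pop()
    return None


# Toy alignment: invented fragments with '-' marking gaps (NOT real NADPH oxidase sequence).
toy_alignment = {
    "protein_A": "MK-CFA",
    "protein_B": "MKLCFA",
    "protein_C": "--QCYA",
}

column = 3
shared = conserved_residue(toy_alignment, column)
positions = {name: residue_number(seq, column) for name, seq in toy_alignment.items()}
print(shared, positions)
```

Because gaps shift the numbering, the same conserved column carries a different residue number in each sequence, which is why one cysteine can be numbered 890, 537 and 1315 in three proteins.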

To determine whether AtRBOHD is S-nitrosylated in vivo during 
the hypersensitive response, transgenic atrbohD lines expressing 
either Myc-tagged wild-type AtRBOHD or mutant derivatives were
challenged with Pst DC3000 (avrB). Subsequently, endogenous proteins 
were subjected to biotin-switch analysis and biotinylated proteins 
purified with streptavidin beads. These proteins were then immuno-
blotted with an anti-Myc antibody. Both wild-type AtRBOHD and the
Cys825Ala mutant were S-nitrosylated, but the Cys890Ala mutant 
and the Cys825Ala Cys890Ala double mutant were not (Fig. 3e). 
Thus, AtRBOHD is specifically S-nitrosylated in vivo at Cys 890 
during the plant defence response. 

To understand whether S-nitrosylation of Cys 890 can modulate the 
activity of AtRBOHD, we first computationally modelled the structure
of this protein. This indicated that Cys 890 is positioned closely behind
the conserved Phe 921 and Phe 570 residues in AtRBOHD and NOX2,
respectively. Similar to the homologous Tyr 247 in ferredoxin reduc-
tase, these residues are expected to have a significant role in binding
flavin adenine dinucleotide (FAD). Accordingly, mutation of Phe 570
was reported to impair NOX2 function. The model further predicts
that S-nitrosylation of AtRBOHD at Cys 890 may disrupt the side-
chain position of Phe 921, impeding FAD binding (Supplementary Fig.
17a, b). To extend these findings, we determined the consequences of 
introducing an S-nitrosylated Cys 890 into our model. This disrupted 
the coplanar localization of Phe 921, thereby destabilizing or sterically 
ejecting FAD (Supplementary Fig. 18a, b). Consistent with these pre- 
dictions, we found that prior GSNO exposure markedly reduced the 
binding of this cofactor to AtRBOHD. Conversely, GSNO did not 
diminish FAD binding in the Cys 890 AtRBOHD mutant (Fig. 4a). 

Figure 4 | The AtRBOHD Cys890Ala mutant shows increased activity
during the defence response, amplifying ROI accumulation and cell death
development. a, Quantification of FAD binding following prior GSNO or GSH
treatment of wild-type AtRBOHD or the mutant Cys890Ala derivative.
b, NADPH oxidase activity in plant extracts from the given Arabidopsis
genotypes infiltrated with Pst DC3000 (white) or Pst DC3000 (avrB) (black) at
24 h.p.i. c, ROI accumulation determined by DAB staining in the leaves of given
plant lines infiltrated with Pst DC3000 (white) and Pst DC3000 (avrB) (black)
at 24 h.p.i. d, Extent of cell death development determined by electrolyte
leakage in the given wild-type and mutant Arabidopsis lines following challenge
with Pst DC3000 (avrB) infiltrated at 1 × 10° c.f.u. ml⁻¹. Data points represent
mean ± s.e. (n = 3). Unless stated otherwise, all strains of Pst DC3000 were
infiltrated at 1 × 10° c.f.u. ml⁻¹.

Thus, S-nitrosylation of AtRBOHD may preclude FAD binding,
inhibiting the activity of this enzyme.

To explore the possible biological consequence of AtRBOHD SNO
formation at Cys 890 and the resulting loss of FAD binding, we 
monitored NADPH oxidase activity in mutant atrbohD plants expres- 
sing either a wild-type AtRBOHD transgene or the Cys890Ala deriv- 
ative. Pathogen-induced levels of leaf NADPH oxidase activity were 
significantly increased in the Cys890Ala mutant, relative to wild type 
(Fig. 4b). Furthermore, pathogen-induced ROI accumulation was 
greater in the Cys890Ala mutant line than in wild type (Fig. 4c). 
These data imply that S-nitrosylation of AtRBOHD at Cys 890 during
the development of the hypersensitive response blunts NADPH oxidase
activity, resulting in decreased ROI generation. To determine the possible
impact of the AtRBOHD Cys890Ala mutation on cell death, we
challenged plants possessing this mutation with Pst DC3000 (avrB)
and scored the extent of their response. There was a significant
increase in cell death in the AtRBOHD Cys890Ala line relative to wild
type (Fig. 4d). Thus, enhanced ROI production in the Cys890Ala
mutant, which is not a target for S-nitrosylation, facilitates CDHR.
This work was confirmed and extended by determining in parallel
SNO accumulation, NADPH oxidase activity and cell death development
in the AtRBOHD Cys890Ala mutant line, relative to wild type,
following attempted pathogen ingress (Supplementary Fig. 19a-c).
Together, this information implies that S-nitrosylation of AtRBOHD
at Cys 890 promoted by increasing SNO concentrations serves to 
curb excessive cell death by blunting AtRBOHD-dependent ROI 
synthesis. 

These data establish a molecular framework for SNO function during 
the hypersensitive response. We speculate that on attempted infection, 
total cellular SNOs governed by AtGSNOR1 and ROIs synthesized by
AtRBOHD*, in combination with salicylic acid accumulation", posi- 
tively regulate the development of cell death. However, as SNO con- 
centrations increase during the pathogen-triggered nitrosative burst, 
salicylic acid accumulation is reduced and further S-nitrosylation of 
AtRBOHD at Cys 890 decreases ROI synthesis. Collectively, this 
molecular dialogue may serve to limit cell death development during 
the hypersensitive response (Supplementary Fig. 20). 

Our findings are reminiscent of those in animals, where increasing 
concentrations of total cellular SNOs drive apoptosis in a variety of 
tissues'*. Conversely, S-nitrosylation of the pro-apoptotic regulators 
nuclear factor κB*° and caspase cysteine protease” functions to constrain
cell death development. Furthermore, SNO formation at a conserved
cysteine in NADPH oxidases from plants, humans and flies
implies that this novel pro-survival mechanism might operate during 
immune function throughout such complex eukaryotes. 


METHODS SUMMARY 


Pathogen inoculations and small-molecule measurements. A variety of Pst 
DC3000 strains and H. arabidopsidis isolates were used for pathogen inoculations, 
as indicated. Trypan blue and DAB staining were conducted for detection of 
cell death and ROI accumulation, respectively”*. Salicylic acid was determined 
using a mini-scale procedure” based on high-pressure liquid chromatography. 
SNO contents were analysed using a chemiluminescence-based procedure”. 
Diaminofluorescein diacetate imaging was applied for the detection of nitric oxide. 
Electrolyte leakage was measured by using a DiST WP conductivity meter to 
quantify cell death. 

Gene constructs and transgenic plants. For cloning of NADPH oxidase genes 
and site-directed mutagenesis, the pGEX4T-1 vector and the QuickChange II Site- 
Directed Mutagenesis Kit (Stratagene) were used, respectively. Transgenes were 
generated by using the pUNI51 cloning vector and the pKYLX-myc9-loxp binary 
vector, and subsequently were introduced into Arabidopsis plants. See Methods for 
full details. 

Biochemistry and computational modelling. NADPH oxidase activity was 
determined by using epinephrine and NADPH as substrates. In vitro and in vivo
S-nitrosylation assays used either an anti-biotin or an anti-Myc9 antibody, respec- 
tively. Mass spectrometry was done using a capillary gas high-pressure liquid 
chromatography tandem mass spectrometry analysis system. For computational 
modelling of NOX2 and AtRBOHD, we used Phyre, DeepView, SwissModel and
Jpred as described in Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Histochemical analysis and small-molecule measurements. Cells committed to 
die were visualized with lactophenol-trypan blue as described previously”. 
Peroxides were stained with DAB. Catalase effectively eliminated DAB staining. 
H2O2-dependent DAB staining was observed in all the Arabidopsis lines described
in this work. For quantification of either cell death or H2O2 accumulation, the
extent of either trypan blue or DAB staining, respectively, was determined in six 
leaves per line in arbitrary units, for each time point and from three independent 
experiments, using the ‘saturation’ function in PaintShop Pro 8 (Corel). 

Salicylic acid and salicylic acid β-glucoside concentrations were determined
using a mini-scale procedure” based on high-pressure liquid chromatography. 
For the identification of SNOs, tissue extracts were generated at given times after 
pathogen inoculation and these samples were analysed by a chemiluminescence- 
based procedure (Sievers nitric oxide analyser) as described previously”’. 

For diaminofluorescein diacetate imaging, rosette leaves were incubated for 
15 min in a solution of 15 µM diaminofluorescein diacetate (Alexis) containing
5 mM MES-KOH, pH 5.7, 0.25 mM KCl and 1 mM CaCl2, and were then washed
for 5 min (ref. 31). Fluorescent signals were detected using a Bio-Rad Radiance 
2100 confocal microscope (Nikon Eclipse, TE2000-U). The dye was excited at 
488 nm, and images were collected at emission wavelengths of 500-530 nm.
Green fluorescence-specific intensities were quantified using IMAGEJ (version
1.45h). 

Plant material and pathogen inoculations. Arabidopsis accession Col-0 and 
cognate mutants were grown under 16h of light at 22 °C and 8h of darkness at 
18°C. Pst DC3000 was grown, maintained and inoculated as described prev- 
iously**. Pst DC3000 strains were inoculated at the concentrations stated. H. 
arabidopsidis infections were as described previously’®. 

Electrolyte leakage. The protocol for electrolyte leakage was adapted from a 
previously described method**. Four-week-old plants were injected with bacteria 
in 10 mM MgCl2. Ten minutes after injection, 5.0-mm-diameter leaf discs were
collected from the injected area and washed extensively with water for 10 min, and 
then ten discs were placed in a Petri dish with 6 ml of water. Conductivity mea- 
surements (three replicates for each treatment) were taken over time by using a 
DiST WP conductivity meter (HANNA Instruments). The units of this measurement
are microsiemens per centimetre, where the distance refers to that between
the electrodes. 

Cloning of NADPH oxidase genes and site-directed mutagenesis. The primers 
used to clone the N terminus (from K2 to N357; 356 amino acids) and the C 
terminus (from K756 to F921; 166 amino acids) of AtRBOHD, the C terminus
(from N430 to E540; 111 amino acids) of Homo sapiens NOX2 (gi_163854302) 
and the C terminus (from 11198 to E1338; 141 amino acids) of Drosophila 
melanogaster Nox (gi_161077139) are as follows. AtRBOHD N terminus: 
5'-AAGGATCCAAAATGAGACGAGGCAATTCA-3’ (forward); 5’-TTCTAG 
TTGCTCTCTTTTGCCGGTCT-3’ (reverse). AtRBOHD C terminus: 5'-AAG
GATCCAAGGACATCATCAACAACATG-3’ (forward); 5'-TTCTAGAAGTT 
CTCTTTGTGGAA-3’ (reverse). NOX2 C terminus: 5’-GTAATGGATCCAAC 
GCCACCAATCT-3’ (forward); 5’-TATGCTCTCGAGTCATTCAGGTCCACA 
GA-3’ (reverse). Nox C terminus: 5'-GAAGAGCAAAAAGCGGAGTC-3’ (forward);
5'-GGATTTGCCTTTCGTAAGGA-3’ (reverse). The amplified PCR products
were cloned into pGEX4T-1 vector at the sites of BamHI/EcoRI, BamHI/XhoI
and EcoRI/XhoI, respectively.
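
The fragment sizes quoted above follow from inclusive residue numbering (last residue minus first residue, plus one). As a minimal sketch for checking the stated lengths, the following uses only the residue positions given in the text; the dictionary and function names are illustrative and are not part of the original protocol.

```python
# Inclusive residue ranges quoted in the text: (first residue, last residue).
# The labels and the helper name are illustrative assumptions.
SEGMENTS = {
    "AtRBOHD N terminus": (2, 357),
    "AtRBOHD C terminus": (756, 921),
    "NOX2 C terminus": (430, 540),
    "Nox C terminus": (1198, 1338),
}


def segment_length(first: int, last: int) -> int:
    """Return the number of residues in the inclusive range first..last."""
    return last - first + 1


for name, (first, last) in SEGMENTS.items():
    print(f"{name}: {segment_length(first, last)} amino acids")
```

The four values printed by this sketch can be compared directly with the amino-acid counts given in parentheses in the paragraph above.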

Site-directed mutagenesis was carried out with QuickChange II Site-Directed 
Mutagenesis Kit (Stratagene). All procedures followed the manufacturer’s manual 
and specific primers used for AtRBOHD, NOX2 and Nox mutations are as follows. 
AtRBOHD C825A: 5’-GAGCTTCACAATTATGCCACGAGTGTGTACGA-3' 
(forward); 5'-TCGTACACACTCGTGGCATAATTGTGAAGCTC-3’ (reverse).
AtRBOHD C890A: 5'-ATAGGAGTCTTCTACGCTGGAATGCCAGGAAT-3' 
(forward); 5'-ATTCCTGGCATTCCAGCGTAGAAGACTCCTAT-3’ (reverse). 
NOX2 C537A: 5'-ATAGGAGTTTTCCTCGCCGGACCTGAATGACTC-3’ (for- 
ward); 5’-GAGTCATTCAGGTCCGGCGAGGAAAACTCCTAT-3’ (reverse). 
Nox C1315A: 5'-GTCACCGTCTTCTACGCCGGCCCACCACAGTTG-3' (forward);
5’-CAACTGTTGGTGGGCCGGCGTAGAAGACGGTGAC-3’ (reverse).
Underlining denotes the codon of alanine. 
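
Site-directed mutagenesis primers of this kind are normally designed as exact reverse complements of one another, which gives a simple way to sanity-check a transcribed primer pair. The sketch below is an illustrative check, not part of the published protocol; the function names are assumptions, and the sequences are the AtRBOHD C890A pair as listed above.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()


# AtRBOHD C890A primer pair as listed in the text.
C890A_FORWARD = "ATAGGAGTCTTCTACGCTGGAATGCCAGGAAT"
C890A_REVERSE = "ATTCCTGGCATTCCAGCGTAGAAGACTCCTAT"

print(is_complementary_pair(C890A_FORWARD, C890A_REVERSE))
```

A pair that fails this check is not necessarily wrong, but a mismatch is a useful flag for a transcription error in one of the two sequences.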

Transgenic plant materials. The atrbohD line was complemented with a wild- 
type copy of AtRBOHD and also transformed with C825A, C890A or the double 
mutant. Briefly, a wild-type copy and a mutant copy of AtRBOHD were amplified
by using forward primer 5’-CGGAATTCGGATGAAAATGAGACGAGGCAA 
TTCA-3’ and reverse primer 5’-CGGGATCCTAGAAGTTCTCTTTGTGGA 
AGTC-3’, and cloned into pUNI51 vector (EcoRI/BamHI). Subsequently, the 
pUNI51 constructs harbouring AtRBOHD were recombined with pKYLX- 
myc9-loxP binary vector by a Cre recombinase™*. The resulting constructs were 
introduced into Agrobacterium strain GV3101 and subsequently transformed into 
Arabidopsis plants. Transgenic T1 plants were identified by kanamycin selection. 


NADPH oxidase biochemistry. NADPH oxidase activity was measured as 
described previously**. Briefly, 1g of leaf tissue was ground in liquid nitrogen 
and dissolved in extraction buffer (0.25M sucrose, 50mM HEPES, pH 7.2, 
3mM EDTA, 1mM dithiothreitol, 0.6% PVP, 3.6mM L-cysteine, 0.1mM 
MgCl2 and protease inhibitor tablet (Roche)). The crude extract was centrifuged
at 10,000g for 30min and the resulting supernatant was ultracentrifuged at 
203,000g for 1h. The resulting pellet was resuspended in extraction buffer and 
used as the membrane fraction to measure NADPH oxidase activity spectropho- 
tometrically at 480 nm using epinephrine and NADPH as substrates. 
Expression and purification of recombinant proteins. Recombinant proteins 
were expressed in Escherichia coli strain BL21 (DE3) by adding 0.3 mM IPTG with 
a 6-h incubation. The GST-tagged proteins were purified using a MagneGST 
Protein Purification System (Promega). All procedures followed the manufac- 
turer’s manual. 

In vitro and in vivo S-nitrosylation assays. Recombinant proteins were in vitro
S-nitrosylated with the stated concentration of the given nitric oxide donor in 
500 µl volumes for 20 min in darkness. Donors were removed using Micro Bio-Spin
P6 columns (Bio-Rad) and the resulting proteins were subjected to the biotin-switch
technique” by western blot assay using anti-biotin antibody (New England
Biolabs). For in vivo assay, Arabidopsis plants were inoculated with Pst DC3000
(avrB) at 10’ c.f.u. ml⁻¹ and an anti-Myc9 antibody (Sigma) was used.

Mass spectrometry. All chemicals were purchased from Sigma-Aldrich unless 
otherwise stated. Acetonitrile and water for liquid chromatography tandem mass 
spectrometry and sample preparation were of high-pressure liquid chromato- 
graphy quality (Fisher). Formic acid was Suprapure 98-100% (Merck) and 
trifluoroacetic acid was 99% purity sequencing grade. Sequencing-grade modified 
porcine trypsin was purchased from Promega and GluC from Worthington 
(Lorne Lab). All high-pressure liquid chromatography mass spectrometry (LC- 
MS) connector fittings were from Upchurch Scientific or Valco (Hichrom and 
RESTEK). 

As S-nitrosothiol formation at Cys 890 of AtRBOHD was found to be relatively
labile in vitro, we used a well established method” that utilizes iodoacetamide to
form a carbamidomethyl ion at sites of S-nitrosothiol formation, which are not
blocked by treatment with methyl methanethiosulphonate (MMTS). This analysis 
revealed striking S-nitrosylation at Cys 890 of AtRBOHD but the complete 
absence of S-nitrosothiol formation at Cys 825 following treatment with a nitric 
oxide donor. 

The GST-AtRBOHD C terminal was expressed in E. coli BL21 and purified 
using glutathione Sepharose 4B (GE Healthcare) in native condition. The purified 
protein solution was desalted using a Zeba Desalt spin column (Thermo 
Scientific). The desalted protein was treated with or without Cys-NO (final con- 
centration, 0.5 mM) in HENS buffer for 20 min at room temperature (25 °C). Cys- 
NO was then removed using a Zeba desalt column and the free cysteines were 
blocked by MMTS in HENS buffer with 2.5% SDS for 20 min at 50 °C. The treated 
protein was precipitated by acetone and resuspended in HENS buffer with 1% 
SDS. Sodium ascorbate (10mM) and iodoacetamide (50mM) were added to 
remove S-NO bonds and for protein acylation. The proteins were separated by 
SDS-PAGE and the protein gel was excised, cleaned and digested with GluC and 
Trypsin at 37 °C for 16 h. The digested peptides were blocked first with MMTS and 
analysed by LC-MS. 

Capillary gas high-pressure liquid chromatography tandem mass spectrometry 
(MSMS) analysis was performed on an online system consisting of a micropump 
(1200 binary HPLC system, Agilent) coupled to a hybrid LTQ-Orbitrap XL instru- 
ment (Thermo-Fisher). The LTQ was controlled through XCALIBUR 2.0.7. 
Samples were reconstituted in 10 µl loading buffer before injection and analysed
on a 1-h gradient for data-dependent analysis.

MSMS data were searched using MASCOT, versions 2.2 and 2.3 (Matrix 
Science Ltd), against a small database comprising the most common contaminant 
and various constructs. Variable methionine oxidation and cysteine methylthiol 
and carbamidomethylation were considered in all analyses. The precursor mass 
tolerance was set to 7 p.p.m. and the MSMS tolerance was set to 0.4 AMU.
Fragmentation patterns were confirmed using ProteinProspector (http://prospector. 
ucsf.edu). Label-free quantitation was performed using PROGENESIS (Nonlinear 
Dynamics). For label-free quantitation, the number of ‘features’ (that is, signal at a 
specific retention time and m/z) was reduced to only MSMS peaks with a charge of 
2+, 3+ or 4+ and only the five most intense MSMS spectra per feature” were kept. Sets
of multicharged ions (2+, 3+, 4+) were extracted from each LC-MS run. 
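
The two search tolerances quoted above are of different kinds: the precursor tolerance is relative (parts per million of the theoretical m/z), whereas the MSMS tolerance is an absolute mass window. As a rough sketch of how such filters are commonly applied (this is not the MASCOT implementation, and the function names are illustrative assumptions), the two checks can be written as:

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million relative to the theoretical value."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz: float, theoretical_mz: float,
                               tol_ppm: float = 7.0) -> bool:
    """Relative check, using the 7 p.p.m. precursor tolerance quoted in the text."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def within_fragment_tolerance(observed: float, theoretical: float,
                              tol_amu: float = 0.4) -> bool:
    """Absolute check, using the 0.4 AMU MSMS tolerance quoted in the text."""
    return abs(observed - theoretical) <= tol_amu


# Hypothetical values chosen only to illustrate the arithmetic.
print(within_precursor_tolerance(1000.005, 1000.0))  # about 5 p.p.m. error
print(within_precursor_tolerance(1000.010, 1000.0))  # about 10 p.p.m. error
```

With these illustrative numbers, an error of about 5 p.p.m. falls inside the 7 p.p.m. window and an error of about 10 p.p.m. falls outside it.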
FAD-binding activity assay. FAD-binding activity was measured as described 
previously*’. Briefly, proteins purified using a MagneGST Protein Purification 
System (Promega) were incubated with either GSNO or GSH for 20 min in the 
dark, and the excess GSNO or GSH was removed using a Bio-Spin6 column (Bio-Rad).
The resulting compounds were further incubated with FAD (250 µM) for
30 min in the dark. Unbound FAD was removed using a Bio-Spin6 column, and 
the FAD content was determined by boiling the resulting protein samples for
5 min in the dark, followed by centrifugation at 14,000g for 10 min to remove 
coagulated protein. The absorbance of the released FAD was measured at 450 nm. 
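
The text reports only that absorbance of the released FAD was read at 450 nm. If a reader wished to convert such a reading into a concentration, one common approach is the Beer-Lambert law, c = A / (ε·l). The sketch below is an illustration under stated assumptions and is not taken from the paper: the molar extinction coefficient (about 11,300 M⁻¹ cm⁻¹ for free FAD at 450 nm, a commonly cited value) and the 1-cm path length are both assumptions, as are the function and constant names.

```python
# Assumed values, not stated in the source text:
FAD_EXTINCTION_450NM = 11_300.0  # M^-1 cm^-1, commonly cited for free FAD at 450 nm
PATH_LENGTH_CM = 1.0             # standard cuvette path length


def fad_concentration_um(absorbance_450: float,
                         extinction: float = FAD_EXTINCTION_450NM,
                         path_cm: float = PATH_LENGTH_CM) -> float:
    """Convert an A450 reading to FAD concentration in micromolar (Beer-Lambert law)."""
    molar = absorbance_450 / (extinction * path_cm)
    return molar * 1e6


# Hypothetical reading, for illustration only.
print(f"{fad_concentration_um(0.113):.1f} µM")
```

Relative comparisons between treatments, as in Fig. 4a, do not depend on the exact coefficient, because it cancels when readings are expressed as a ratio.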
Computational modelling of NOX2 and AtRBOHD. The C-terminal 287 
amino acids of NOX2 were used for identifying structural homologues with 
Phyre**. This identified PDB crystal structure 1FDR, encoding an E. coli flavodoxin 
reductase, as a potential homologue. Structural alignments were optimized using 
DeepView and submitted for threading over 1FDR at the SwissModel server”. The
resulting computational model contained several loop regions unique to NOX2. 
These regions were analysed for secondary structure using Jpred*® and the 
DeepView loop library. The computational model of the AtRBOHD C terminus
was built by threading over both 1FDR and NOX2. 

The model indicates that Cys 890 is positioned closely behind the conserved Phe 
921 and Phe 570 residues in AtRBOHD and NOX2, respectively. Similar to the
homologous Tyr247 in ferredoxin reductase”, these residues are expected to have 
a significant role in FAD binding. Accordingly, mutation of Phe 570 has been 
reported to impair NOX2 function®. The model further predicts that 
S-nitrosylation of AtRBOHD at Cys 890 may disrupt the side-chain position of
Phe 921, impeding FAD binding. 
CTCF-promoted RNA polymerase II
pausing links DNA methylation to splicing 


Sanjeev Shukla, Ersen Kavak, Melissa Gregory, Masahiko Imashimizu, Bojan Shutinoski, Mikhail Kashlev,
Philipp Oberdoerffer, Rickard Sandberg & Shalini Oberdoerffer


Alternative splicing of pre-messenger RNA is a key feature of transcriptome expansion in eukaryotic cells, yet its 
regulation is poorly understood. Spliceosome assembly occurs co-transcriptionally, raising the possibility that DNA 
structure may directly influence alternative splicing. Supporting such an association, recent reports have identified 
distinct histone methylation patterns, elevated nucleosome occupancy and enriched DNA methylation at exons relative 
to introns. Moreover, the rate of transcription elongation has been linked to alternative splicing. Here we provide the 
first evidence that a DNA-binding protein, CCCTC-binding factor (CTCF), can promote inclusion of weak upstream 
exons by mediating local RNA polymerase II pausing both in a mammalian model system for alternative splicing, CD45, 
and genome-wide. We further show that CTCF binding to CD45 exon 5 is inhibited by DNA methylation, leading to 
reciprocal effects on exon 5 inclusion. These findings provide a mechanistic basis for developmental regulation of splicing
outcome through heritable epigenetic marks.


It is estimated that greater than 90% of human genes undergo alterna- 
tive splicing of pre-mRNA" and aberrant splicing has been impli- 
cated in a number of human diseases’. Alternative splicing decisions 
are determined by the ability of weak splice sites to effectively compete 
with strong splice sites for detection by the spliceosome*. The balance 
between splice site selection is principally influenced by two vari- 
ables”: (1) the availability of splicing factors that detect enhancer or 
silencer sequences encoded within nascent RNA*® and (2) the rate of 
RNA polymerase II (pol II) transcription elongation, wherein a slow 
rate favours co-transcriptional spliceosome assembly at weak splice 
sites”®. 

A surprising result of genome-wide chromatin-immunoprecipitation- 
sequencing (ChIP-seq) studies is the non-random distribution of several 
epigenetic marks in exons relative to introns. In particular, exons show 
elevated nucleosome density, DNA methylation of cytosine, and over- 
representation of certain histone modifications, relative to introns’’’. 
Differential ‘marking’ of exons on DNA highlighted a possible connec- 
tion between DNA structure and co-transcriptional RNA proces- 
sing. Accordingly, several recent studies suggest that exonic histone 
modification may affect variable inclusion of alternative exons'*”. 
Collectively, these studies raise the intriguing possibility that epigenetic 
modifications are maintained on DNA to aid the spliceosome in the 
process of exon definition’”’, and that differential chromatin assembly 
may represent a critical aspect of alternative splicing regulation. 

Processing of CD45 pre-mRNA (also known as PTPRC) is a well 
established model system to study the regulatory mechanisms of 
alternative splicing. CD45 is a trans-membrane protein tyrosine phos- 
phatase that initiates signalling through antigen receptors by depho- 
sphorylating the inhibitory tyrosine on Src family kinases'’. Variable 
exclusion of exons 4-6 (A-C) of CD45 transcripts is tightly correlated 
with stages of lymphocyte development and expressed splice variants 
can be distinguished using isoform-specific antibodies and flow cyto- 
metry’? (Supplementary Fig. 1). In general, the larger, exon 4-containing 
isoforms (CD45RA) are expressed early in peripheral lymphocyte
development, whereas the shortest isoform (CD45RO), which lacks all
three variable exons, is expressed in terminally differentiated lympho- 
cytes’. We recently identified heterogeneous ribonucleoprotein L-like 
(hnRNPLL) as a tissue-specific master regulator of the CD45RA to 
CD45RO transition in peripheral lymphocytes”. hnRNPLL binds to 
exons 4 and 6 of CD45 mRNA and blocks the inclusion of both exons 
in the mature message’. In contrast, hnRNPLL expression does not 
influence exclusion of exon 5 (refs 20, 22; Supplementary Fig. 2a). In 
vitro studies aimed at uncovering regulators of exon 5 exclusion have 
identified several ubiquitously expressed splicing factors”. However, 
peripheral lymphocytes retain exon 5 until the terminal stages of 
development™* (Supplementary Fig. 2b), portending a yet uncovered 
layer of regulation. 


CTCF regulates exon 5 inclusion in CD45 mRNA 
Considering the growing evidence for DNA-mediated regulation of 
spliceosome assembly, we explored the hypothesis that exon 5 inclu- 
sion is mediated by the epigenetic structure of the gene encoding CD45. 
By analysing published ChIP-seq data within the UCSC genome 
browser*>”®, we identified a strong CTCF peak overlapping with exon 
5 across cell types. CTCF is an 11 zinc-finger DNA-binding protein 
with multiple nuclear functions, largely grouped into two categories: 
insulating inactive regions of the genome from active regions and 
promoting long-range interactions between distal regions of the 
genome’’**. Whereas intergenic CTCF is an effective barrier to tran- 
scription”’, we found that CTCF binding at exon 5 is maintained in cells 
that actively transcribe abundant CD45 (ref. 26) and that exon 5 binding
is conserved in murine splenocytes (Fig. 1a and Supplementary
Fig. 2c), indicating an important, position-dependent ‘non-insulator’ 
function. We thus explored whether and how CTCF binding at CD45 
exon 5 DNA influences processing of CD45 transcripts. 

To dissect the impact of CTCF on exon 5 splicing in a cell-based 
system, we screened several human Burkitt lymphoma B cell lines for 
differences in expression of the exon 5-containing CD45RB isoform.


Center for Cancer Research, Mouse Cancer Genetics Program, National Cancer Institute at Frederick, Frederick, Maryland 21702, USA. Department of Cell and Molecular Biology, Karolinska Institutet, SE- 
171 77 Stockholm, Sweden. Ludwig Institute for Cancer Research, SE-171 77, Stockholm, Sweden. *Center for Cancer Research, Gene Regulation and Chromosome Biology Laboratory, National Cancer 
Figure 1 | Binding of CTCF to exon 5 of CD45 DNA is associated with
inclusion of exon 5 in CD45 transcripts. a, CTCF ChIP in murine splenocytes
and quantitative PCR (qPCR) relative to rabbit Ig control ChIP (n = 2). b, Cell-
surface staining for CD45RA (exon 4-containing) and CD45RB (exon
5-containing) isoforms and total CD45 (pan) in parental BL41 B cells (RBhigh),
cell-culture-derived CD45RB bimodal cells, and CD45RB low (RBlow) cells


Whereas lymphocyte cell lines generally express high levels of 
CD45RB (Supplementary Fig. 2d, e), culturing BL41 cells in non- 
heat-inactivated fetal bovine serum (FBS) resulted in bimodal 
CD45RB expression. CD45RB low cells (RBlow) were sorted from the
bimodal population and stably maintained (Fig. 1b). Parental BL41 cells
(RBhigh) and sorted RBlow cells express equivalent exon 4-containing
CD45RA isoforms and total CD45 (Fig. 1b), indicating specific exclusion
of exon 5 in the RBlow population. Quantitative RT-PCR with exon
junction-spanning primers validated exon 5 skipping: RBlow cells
showed reduced exon 4/5 and exon 5/6 junctions, but enhanced exon
4/6 junctions relative to RBhigh cells (Fig. 1c). Notably, several histone
modifications that have been previously linked to alternative splicing
(H3K36me3, H3K27me3, H3K4me3)'*” are equivalently detected
at exon 5 in RBhigh and RBlow cells (Supplementary Fig. 3a, b). CTCF
ChIP in the newly identified RBhigh and RBlow BL41 cells, and CD45RB-high
BJAB cells revealed a strong positive correlation between exon 5
inclusion in CD45 mRNA and CTCF binding at CD45 exon 5 DNA
(Fig. 1b-d and Supplementary Fig. 2e), particularly in BJAB cells, which
also express elevated CTCF protein (Supplementary Fig. 3c). In agree- 
ment with the observation that exon 5 splicing is independent of 
hnRNPLL, modulation of hnRNPLL expression did not influence 
CTCF binding to CD45 exon 5 (Supplementary Fig. 3d). 

To assess whether the association between CTCF binding and exon 
5 inclusion reflects a direct role for CTCF in CD45 alternative splicing, 
we used RNA interference to deplete CTCF from our B cell lines (Sup- 
plementary Fig. 4a). Decreasing CTCF levels in bimodal BL41 cells led 
to a marked loss of CD45RB expression without reducing overall CD45
levels (Fig. 2a and Supplementary Fig. 4b). Similarly, CTCF depletion
in RBhigh cells and BJAB cells led to a substantial loss of CD45RB
staining with little effect on overall CD45 levels (Fig. 2a). Quantitative
RT-PCR of CD45 mRNA in CTCF-depleted RBhigh and BJAB
cells validated reduced exon 5 expression and increased exon 4/6 junctions
(Fig. 2b, c, respectively; Supplementary Fig. 4c, additional transductions),
confirming that CTCF mediates exon 5 inclusion.


CTCF promotes pol II pausing at CD45 exon 5 


We next investigated the mechanism by which CTCF binding to CD45 
DNA influences mRNA splicing outcomes. Given that genome-wide 
ChIP-seq studies have revealed overlapping intragenic CTCF and pol 
sorted from the bimodal BL41 population. c, GAPDH-normalized qRT-PCR
data from RBhigh and RBlow cells using the indicated junction-spanning primers
(n = 3). d, CTCF ChIP in BJAB, RBhigh and RBlow cells and qPCR for CD45
exons and introns (n = 3-6). All graphs show mean values ± standard
deviation (s.d.). P, two-tailed Student’s t test comparing the indicated
samples.


II peaks*, we examined whether CTCF promotes inclusion of exon 5 
through interference with pol II elongation. ChIP confirmed signifi- 
cant enrichment of pol II at CD45 exon 5 DNA, but not at adjacent 
regions in RBhigh cells as compared to RBlow cells (Fig. 3a). Using
antibodies specific to pol II phosphorylated on the carboxy-terminal 
domain (CTD), we further showed that elevated pol II at CD45 exon 5 
in RBhigh cells is associated with the elongating form phosphorylated
on serine 2 of CTD YSPTSPS heptad repeats*’ (Supplementary Fig. 5a, 
b). Notably, CTCF depletion from RBhigh cells (Supplementary Fig. 5c,
d) reduced both CTCF binding (Fig. 3b) and pol II levels at CD45 exon 
5 (Fig. 3c). 
Figure 2 | CTCF depletion leads to reduced exon 5 inclusion in CD45
transcripts. a, Cell-surface CD45RB isoform and total CD45 expression in cells
transduced with short hairpin RNA (shRNA) against CTCF (CTCF-sh3 and/or
sh-4) or control shRNA against red fluorescent protein (RFP). b, c, RT-PCR
in CTCF-depleted RBhigh (b) and BJAB cells (c) from a to detect CD45 (left) and
CTCF (right) mRNA levels (n = 3). Graphs show mean values ± s.d. P,
two-tailed Student’s t test.
Figure 3 | CTCF binding at CD45 exon 5 DNA facilitates exon 5 inclusion in
CD45 transcripts through local pol II pausing. a, RNA pol II ChIP and qPCR
relative to mouse Ig control IP (n = 3). b, CTCF ChIP in RBhigh cells transduced
with shRNA against CTCF versus shRFP-transduced cells and qPCR relative to
rabbit Ig control IP (n = 2). c, RNA pol II ChIP of RBhigh cells from b and qPCR
relative to mouse Ig control IP (n = 2). d, In vitro transcription with a DNA 
oligo incorporating a CTCF binding site at position 26 relative to elongation 
complex assembly. Recombinant CTCF and TFIIS protein were introduced as 
indicated, with variable effects on pausing at adenine 21 (A21). 


The above data definitively link CTCF binding, pol II pausing and
exon 5 inclusion, but do not exclude additional, context-dependent 
secondary effects. To query whether CTCF binding to an actively 
transcribed template is sufficient to promote pol II pausing, we 
assembled a pol II ternary elongation complex from synthetic DNA 
and RNA oligonucleotides and highly purified yeast pol II’. A CTCF 
binding site was incorporated into the template DNA at position 26 
relative to the hybridization location of a 9-nucleotide RNA primer 
(Fig. 3d). CTCF binding to the target sequence was confirmed by 
electrophoretic mobility shift assay (EMSA) (Supplementary Fig. 6a). 
Incubation with pol II and increasing amounts of recombinant CTCF 
resulted in pausing immediately upstream of the CTCF binding site 
(Fig. 3d). Extended incubation or introduction of the elongation factor 
TFIIS substantially reduced pausing and led to near complete escape of
paused pol II (Fig. 3d and Supplementary Fig. 6b). Thus, CTCF can 
autonomously promote pol II pausing, but not complete arrest, on a 
naked DNA template. These data establish CTCF as a direct impedi- 
ment to transcription that can act in the absence of a particular nucleo- 
some structure or chromatin context. Furthermore, the ability of 
paused pol II to resume transcription efficiently in the presence of 
CTCF supports a physiological role for CTCF in favouring exon inclu- 
sion through transient, spatiotemporal pol II pausing. 

Having demonstrated that CTCF can promote pol II pausing, we 
explored the relationship between CTCF, pol II and exon inclusion in 
a tractable, endogenous system. We generated a wild-type minigene 
extending from intron 3 through intron 7 of human CD45 genomic 
DNA (I3-I7), as well as a mutant analogue, in which the exon 5 CTCF
binding site was disrupted through nucleotide substitution
(I3-I7*CTCF) (Fig. 3e and Supplementary Fig. 6c). The 11 zinc fingers
of CTCF support multiple contacts to substrate DNA**** and a mini- 
mum of five substitutions within the core motif*? were required to 
e, Representation of CD45 minigenes with wild-type (I3-I7) or mutated exon 5
CTCF binding site (I3-I7*CTCF), used in f-j. f, CTCF-ChIP in NIH3T3 and
CHO cells transfected with the CD45 minigenes and qPCR relative to rabbit Ig
control IP. Error bars represent standard error of the mean (s.e.m.) (n = 3).
g-i, RT-PCR from minigene-transfected HEK293, NIH3T3 and CHO cells to
detect the junctions of exons 4/5 (g), 5/6 (h) and 4/6 (i) relative to exon 6
(n = 3). j, RNA pol II ChIP in CHO cells transfected with the CD45 minigenes
and qPCR relative to mouse Ig control IP (n = 3). Unless indicated otherwise,
graphs show mean values ± s.d. P, two-tailed Student’s t test.


significantly ablate CTCF binding (EMSA, Supplementary Fig. 6d). 
To avoid detection of endogenous CD45, which is confined to the 
haematopoietic lineage’*, the minigenes were transfected into several 
fibroblast cell lines. In addition to human HEK293 cells, murine 
NIH3T3 and hamster CHO cells were used to specifically amplify 
human minigene CD45 DNA in ChIP analyses. CTCF ChIP of trans- 
fected NIH3T3 and CHO cells confirmed robust binding to exon 5 of 
the I3-I7 minigene, and complete disruption of binding to the mutated, 
I3-I7*CTCF minigene (Fig. 3f). Quantitative RT-PCR indicated that
both minigenes were comparably expressed and approximated endo- 
genous CD45 levels in immune cells (Supplementary Fig. 6e). 
Mutation of the CTCF binding site in exon 5 led to a marked decrease 
in 4/5 and 5/6 junctions, and increase in 4/6 junctions in all three cell 
types (Fig. 3g, h, i, respectively), resulting in an overall 50-100%
decrease in exon 5 inclusion. Notably, ChIP confirmed increased pol 
II occupancy at exon 5 in the I3-I7 minigene, but not in the mutated, 
I3-I7*CTCF minigene (Fig. 3j). As the two minigenes are identical in
every regard except for the five core nucleotides of the CTCF binding site,
these data establish CTCF as a direct regulator of CD45 exon 5 inclu- 
sion, which operates through promoting local pol II pausing. 


DNA methylation inhibits exon 5 CTCF binding 


Armed with the knowledge that CTCF binding to exon 5 DNA reg- 
ulates inclusion, and given that exon 5 is variably excluded during 
lymphocyte maturation, we asked whether and how CTCF binding is 
modulated to influence splicing outcome. Whereas CTCF is ubiqui- 
tously expressed, binding to DNA is inhibited by methylation on CpG 
dinucleotides*”**. Several recent studies have shown that DNA 
methylation is substantially enriched at exons relative to introns'*!>”*, 
suggesting a role in pre-mRNA processing, yet a causal relationship 
between these processes has not been demonstrated. Methylated 
DNA immunoprecipitation (MedIP) in our B-cell lines suggested that
CTCF binding at CD45 exon 5 and associated exon inclusion are 
indeed regulated by DNA methylation: we detected a strong inverse 
correlation between CTCF and 5-methylcytosine at CD45 exon 5, but 
not at adjacent exons (Figs 1d, 4a). To assess whether DNA methyla- 
tion of CD45 exon 5 and reciprocal loss of CTCF binding contribute 
to exon 5 exclusion during the transition from naive to mature T 
lymphocytes, CD3+ T cells were isolated from human peripheral
blood and sorted into RB^high (naive) and RB^medium (mature)
populations’’ (Fig. 4b). MedIP confirmed significant enrichment of CD45
exon 5 methylation (Fig. 4c) and reduced CTCF binding (Fig. 4d) in
RB^medium cells compared to RB^high peripheral T cells. Thus, CTCF
binding and CD45 exon 5 inclusion are inversely related to DNA 
methylation in several transformed cell lines and in primary T cells. 

To determine whether dynamic methylation of CD45 exon 5 DNA 
is a regulatory mechanism contributing to CD45 alternative splicing, 
we modulated methylation through inhibition of the DNA maintenance
methyltransferase, DNMT1. We reasoned that, if elevated exon 5
methylation and consequent reduced CTCF binding were the principal
components distinguishing RB^low and RB^high cells, inhibition of
methylation should cause RB^low cells to revert to an RB^high phenotype.
Indeed, DNMT1 depletion in RB^low cells (Supplementary Fig. 7a, b)
reduced 5-methylcytosine levels (Fig. 4e) and restored CTCF binding
at CD45 exon 5 (Fig. 4f), leading to enhanced exon 5 inclusion in
CD45 mRNA, as evidenced by increased exon 4/5 and 5/6 junctions
(Supplementary Fig. 7c) and cell-surface CD45RB (Fig. 4g). Notably,
increasing CTCF binding in RB^low cells through reduced exon 5
methylation also reinstated local pol II pausing (Fig. 4h). In addition 
to identifying dynamic DNA methylation as a possible regulatory 
mechanism governing CD45 alternative splicing in vivo, these data 
establish CTCF as the first mechanistic link between DNA methyla- 
tion and alternative pre-mRNA splicing. 


Global effects of intragenic CTCF on splicing 


Although studies of CTCF function have been largely restricted to 
intergenic activities, CTCF ChIP-seq studies found that approximately
40-45% of CTCF binding sites are located intragenically***””*.
Based on our observations with CD45, we propose that some portion 
of intragenic CTCF binding sites operate to influence pre-mRNA 
processing decisions. To globally address the impact of CTCF on 
alternative splicing, we performed CTCF ChIP-seq in BL41 and
BJAB cells to produce cell-type-specific CTCF binding maps, and
high-throughput RNA-sequencing (RNA-seq) of total RNA from 
CTCF-depleted BL41 and BJAB cells and their relevant controls 
(CTCF-sh3, Fig. 2a). Mapping of overall CTCF binding sites in 
BL41 and BJAB cells indicated comparable distribution patterns to 
previous reports (Supplementary Table 2). The mixture of isoforms 
(MISO) model was applied to RNA-seq data (Supplementary Table 3) 
to identify exons with a high probability of differential expression in 
response to CTCF depletion, as assessed by the Bayes factor con- 
fidence index*’. Exons showing altered inclusion in response to 
CTCF depletion were further subdivided into three categories based 
on proximity to a local CTCF binding site: unbound by CTCF, or 
CTCF-bound within 1 kilobase downstream or upstream of the exon 
(Fig. 5a and Supplementary Table 4). CTCF is a global regulator of 
transcription’’ and depletion would be expected to result in some level 
of alternative splicing due to alterations in upstream pathways. 
Accordingly, MISO identified exons that were differentially included 
in mRNA in response to CTCF depletion, but were not locally bound 
by CTCF on the corresponding DNA. Importantly, in BL41 and BJAB 
cells, alternative exons not bound by CTCF were centred at zero 
across Bayes factor thresholds, indicating that secondary effects of 
CTCF depletion showed no preference towards exon inclusion or 
exclusion (Fig. 5b, c, respectively). Similarly, CTCF binding upstream 
of the differentially expressed exon did not show a statistically signifi- 
cant bias towards exon inclusion or exclusion (Fig. 5b, c and Sup- 
plementary Fig. 8a, b). However, we detected a strong correlation 
between CTCF depletion and exon exclusion if CTCF is bound down- 
stream of the alternative exon in both BL41 and BJAB cells (Fig. 5b, c 
and Supplementary Fig. 8a, b). We additionally identified CTCF- 
bound exons that showed reduced inclusion in BL41 and BJAB cells, 
as well as unique examples, indicating a degree of cell-type specificity 
(Supplementary Figs 8c, 9a and Supplementary Table 4). 
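
The threshold-based comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the `(delta_psi, bayes_factor)` tuple layout and the example values are hypothetical, and the snippet does not reproduce MISO itself or the Wilcoxon test used in the paper. It simply reports the mean and s.e.m. of the exon inclusion difference for exons that pass each Bayes factor threshold.

```python
import math
from statistics import mean, stdev


def mean_sem_by_threshold(events, thresholds):
    """Mean and s.e.m. of inclusion differences at each Bayes factor cut-off.

    events: list of (delta_psi, bayes_factor) tuples for one class of exons
            (hypothetical layout, e.g. exons with a downstream CTCF peak).
    thresholds: Bayes factor cut-offs to evaluate.
    Thresholds passed by fewer than two exons are skipped, because a
    standard error cannot be estimated from a single value.
    """
    summary = {}
    for cutoff in thresholds:
        values = [delta for delta, bf in events if bf >= cutoff]
        if len(values) >= 2:
            sem = stdev(values) / math.sqrt(len(values))
            summary[cutoff] = (mean(values), sem)
    return summary


# Made-up numbers: negative values mean reduced inclusion after depletion.
example = [(-0.2, 5), (-0.4, 20), (0.0, 2)]
result = mean_sem_by_threshold(example, [1, 5, 50])
```

A class of exons whose mean stays near zero at every threshold shows no directional bias, whereas a consistently negative mean indicates a shift towards exclusion.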

These genome-level data are consistent with our observations in the 
CD45 model system, wherein CTCF binding downstream of the weak 
3’ splice site flanking exon 5 promoted inclusion of exon 5 in mature 
message, but had no effect on exon 6 (Figs 1c, d and 2b, c). As we had 
mechanistically linked CTCF-associated pol II pausing to CD45 exon
5 inclusion, we examined pol II occupancy at the downstream CTCF
sites that led to reduced exon inclusion upon CTCF depletion.
Inspection of publicly available CTCF ChIP-seq data from CD4+ T
cells*” indicated high conservation of these CTCF binding sites”®. 

Figure 4 | 5-methylcytosine levels (5-mC) are inversely related to CTCF
binding and exon 5 inclusion. a, Methylated DNA immunoprecipitation
(MedIP) in B cell line genomic DNA and qPCR relative to input (n = 5).
b, Representative CD45 isoform expression in primary peripheral human
CD3+ T cells sorted on the basis of cell-surface CD45RB and CD45RO.
c, MedIP and qPCR relative to input in sorted primary human CD3+ T cells
(n = 6, compiled from two donors). d, CTCF-ChIP and qPCR relative to rabbit
Ig control IP, in sorted primary CD3+ T cells (n = 2). e, MedIP and qPCR
relative to input in BL41 RB^low cells transduced with shRNA against DNMT1
versus shRFP-transduced cells (n = 3). f, CTCF ChIP in cells from e and qPCR
relative to rabbit Ig control IP (n = 3). g, Cell-surface CD45RB expression in
cells from e. h, RNA pol II ChIP and qPCR in cells from e relative to mouse Ig
control IP (n = 3). Unless indicated otherwise, graphs show mean values ± s.d.
P, two-tailed Student’s t test.

Figure 5 | Global identification of CTCF-dependent exons. a, Alternative
exons were classified on the basis of the relative location of an exclusive CTCF
peak within 1 kb of the exon. b, Difference in the mean exon inclusion level
between bimodal BL41 cells transduced with shRNA against CTCF versus
shRFP-transduced cells (from Fig. 2a) for exons with CTCF peak in upstream
(blue) or downstream regions (red) but not in the exon body and for exons with
no CTCF binding (black). The mean ± s.e.m. for each class of exons is plotted


Analysis of the corresponding pol II ChIP-seq data’ revealed a stronger 
enrichment of pol II at downstream CTCF binding sites relative to 
upstream exons (Fig. 5d). Enrichment of pol II occupancy at CTCF 
binding sites compared to associated upstream alternative exons was 
confirmed for several genes in BJAB and BL41 cells (Supplementary 
Fig. 9b). Together with our CD45 data, we conclude that CTCF bound 
downstream of alternative exons promotes pol II pausing, providing 
the necessary temporal context for co-transcriptional spliceosome 
assembly at weak upstream splice sites. 


Discussion 


In recent years, the link between DNA structure and pre-mRNA 
processing has been gaining increasing attention. Reports of increased 
nucleosome occupancy and DNA methylation as well as distinct his- 
tone methylation patterns at exons relative to introns have fuelled the 
hypothesis that exons are differentially marked to aid the spliceosome 
in the process of exon definition'*””. It has further been shown that pol 
II occupancy increases in the vicinity of exons, although whether a 
function of DNA sequence, chromatin structure or the presence of 
DNA-binding proteins has not been defined. Recently, several studies 
have linked modification of distinct histone methylation patterns to 
alternative splicing'*’”. However, exonic histone methylation was 
shown to be equivalent in other models of robust exon inclusion 
versus exclusion®, suggesting that, although histone methylation 
patterns may prime splicing decisions, they probably do so in concert 
with other factors. Consistent with the latter, we observed comparable 
histone methylation at exon 5 whether or not exon 5 was included in 
the CD45 message (Supplementary Fig. 3a, b). Rather, we show that 
mutually exclusive DNA methylation and CTCF binding regulate 
exon 5 inclusion through influencing pol II elongation dynamics 
(Supplementary Fig. 10). Given that mapping of CTCF binding sites 
shows roughly 40-70% conservation between tissues”, it is tempting 
to speculate that altered DNA methylation patterns during develop- 
ment can lead to variations in intragenic CTCF binding that thereby 
contribute to tissue-specific alternative splicing patterns. This may be 
especially relevant in pathological conditions, such as cancer, where 
widespread changes in DNA methylation, altered CTCF binding, and 
aberrant alternative pre-mRNA splicing have been reported’**!. 
We predict that our identification of CTCF as a DNA-binding regu- 
lator of alternative pre-mRNA splicing represents the tip of the ice- 
berg, and that a long list of location-specific DNA-binding ‘splicing 
factors’ will follow. 




against increasing Bayes factor thresholds. *P < 0.05, **P < 0.01,
***P < 0.001, Wilcoxon rank sum test for differences in exon inclusion at the
different thresholds. c, Same as b for BJAB shCTCF compared with wild-type
BJAB cells (from Fig. 2a). d, Normalized CD4+ T cell RNA pol II read signal
centred on the alternative exon or the corresponding downstream CTCF peak
summit.


METHODS SUMMARY 


Experiments were performed with BJAB and BL41 cells or primary lymphocytes. 
CD45 isoform analysis was achieved with isoform-specific antibodies or pan- 
antibody directed against a common region of CD45. Transductions were exe- 
cuted with vesicular stomatitis virus G (VSV-G)-pseudotyped lentivirus, and 
selected for puromycin resistance. Quantitative RT-PCR was performed on 
cDNA from total RNA. Protein lysates were prepared with RIPA buffer. ChIP 
and MedIP were conducted with formaldehyde cross-linked, sonicated material. 
In vitro transcription elongation was performed with yeast RNA pol II, yeast
TFIIS and human CTCF. Minigenes were cloned into the pCI-neo (Promega)
construct and transfected with Lipofectamine 2000 (Invitrogen). ChIP-Seq and
RNA-Seq were executed with the Illumina platform. For ChIP-Seq, Illumina 
FastQ files were mapped to the human genome (hg19). Peak calling was run 
using Rabbit Ig control sequencing data as background. For RNA-Seq, exon 
inclusion levels were determined using the MISO program”. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Cell culture. BJAB and BL41 cells were maintained at 37 °C, 5% CO2 in RPMI
(Invitrogen) supplemented with 10% FBS (Hyclone), and 1% L-glutamine. BJAB
and parental RB^high BL41 cells were cultured in heat-inactivated FBS, whereas
RB^low BL41 cells were initially kept in native FBS, but were ultimately transitioned
into inactivated serum. JSL1 cells were maintained at 37 °C, 5% CO2 in RPMI
(Invitrogen) supplemented with 5% FBS (Hyclone), and 1% L-glutamine. Primary
human peripheral blood lymphocytes were purified by spinning through Ficoll
Paque (GE Healthcare). Isolated cells were washed twice with PBS and CD3+ T
cells were isolated with CD3+ microbeads (Miltenyi Biotech). Primary murine
splenocytes were isolated from whole spleen of BL/6 mice. Single-cell suspensions
were lysed with ACK lysis buffer (0.15 M NH4Cl, 10 mM KHCO3, 0.1 mM
Na2EDTA) to remove red blood cells before ChIP assay. JSL1 cells were stimulated
at a concentration of 3 × 10° cells per ml. Phorbol 12-myristate 13-acetate
(PMA) was added at a final concentration of 20 nM. Flow cytometry was per- 
formed 2 days post-stimulation. 

Virus production. Constructs encoding shRNA directed against CTCF and
DNMT1 were obtained from Open Biosystems and were transfected
(Lipofectamine 2000, Invitrogen) along with VSV-G and gag/pol (courtesy of
The RNAi Consortium of the Broad Institute) into 293T cells for viral production.
Viral supernatants were concentrated 50× and aliquoted for storage.

Cell line infection. BJAB and BL41 cells were plated in 96-well round-bottom 
plates at 100,000 cells per well. Five microlitres of virus and 8 µg ml−1 polybrene
were added per well and the plate was spun at 760g for 90 min. The supernatants
were removed and fresh medium was added. Puromycin was added at a final
concentration of 5 µg ml−1 on day 2. Depletion of CTCF from cells resulted in
significant cell death after 1 week in culture and depletion of DNMT1 resulted in
silencing after 10 days in culture. Cells for downstream analysis were collected 5
(shCTCF) or 7 (shDNMT1) days post-infection. To scale up infections for ChIP
and western blotting, infections were performed in individual wells of 96-well 
plates and pooled before harvesting for RNA, ChIP and western blot. Three plates 
were pooled for shCTCF experiments and 3.5 plates were pooled for shDNMT1 
experiments. Three individual RNA and ChIP samples were taken from each of 
the bulk cultures. 

Target sequences of shRNAs. DNMT1-sh3, 5’-CGAGAAGAATATCGAACTCTT-3’;
DNMT1-sh4, 5’-CGACTACATCAAAGGCAGCAA-3’; CTCF-sh3,
5’-CCTCCTGAGGAATCACCTTAA-3’; CTCF-sh4, 5’-GCGGAAAGTGAACCCATGATA-3’;
shRFP, 5’-GAATTAAGAGAGGCTCAGTTA-3’; LL-sh4,
5’-CGACAGGCTCTAGTGGAATTT-3’.

Flow cytometry. The following antibodies were used for flow cytometry: CD45RO 
clone UCHL1 (eBioscience, 12-0457-42, batch no. E034572), CD45RA clone MEM-56
(ExBio, 1P-223, batch no. 11827), CD45RB clone MT4 (BD Pharmingen, 555904,
batch no. 89956) and pan-CD45 clone HI30 (BD Pharmingen, 555483, batch no. 
555483). Staining of CD45 isoforms was performed in separate tubes, to avoid 
competition for antibody binding. Flow cytometry was performed on either a BD 
FACSCalibur or BD LSR II cytometer. 

Quantitative RT-PCR. RNA was isolated with the Qiagen RNeasy Mini Kit and 
reverse transcription was performed with SuperScript II (Invitrogen) according 
to the manufacturer’s instructions. PCR measurements were performed in trip- 
licate in the presence of SYBR green reagent (Roche) and amplification was 
performed on a 480 Light Cycler (Roche). The average cycle thresholds for the 
technical triplicates were calculated to yield one value per primer set for each 
biological replicate. Normalization was performed to GAPDH, RPS16 or surrounding
exon level values using the formula 2^(C_normalization − C_experimental) to determine
relative expression. Averages and standard deviations of the normalized bio- 
logical replicate values were plotted in the figures and used in t-test calculations. 
Figure legends indicate the number of biological replicates (individual RNA pre- 
parations) used in each experiment. 
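
The normalization described above can be written out as a short sketch. The function names and the input layout (lists of cycle thresholds per primer set) are assumptions made for illustration; only the formula 2^(C_normalization − C_experimental), the averaging of technical replicates, and the mean and standard deviation across biological replicates come from the text.

```python
from statistics import mean, stdev


def relative_expression(ct_experimental, ct_normalization):
    """Relative expression for one biological replicate.

    ct_experimental: technical-replicate cycle thresholds for the primer
                     set of interest.
    ct_normalization: technical-replicate cycle thresholds for the
                      normalizer (for example GAPDH).
    Technical replicates are averaged first, then 2^(Cn - Ce) is applied.
    """
    return 2 ** (mean(ct_normalization) - mean(ct_experimental))


def summarize(biological_replicates):
    """Mean and standard deviation over biological replicates.

    biological_replicates: list of (ct_experimental, ct_normalization) pairs,
    one pair per independent RNA preparation.
    """
    values = [relative_expression(exp, norm) for exp, norm in biological_replicates]
    return mean(values), stdev(values)


# Made-up thresholds: the target amplifies two cycles later than the normalizer.
single = relative_expression([24.0, 24.0, 24.0], [22.0, 22.0, 22.0])
avg, sd = summarize([
    ([24.0, 24.0, 24.0], [22.0, 22.0, 22.0]),
    ([23.0, 23.0, 23.0], [22.0, 22.0, 22.0]),
])
```

Each cycle of delay relative to the normalizer halves the reported expression, which is why a two-cycle difference yields one quarter.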

Western blots. Cells were lysed in RIPA buffer (50mM Tris pH 8.0, 150 mM 
NaCl, 1% NP40, 0.1% SDS, 0.5% sodium deoxycholate, and 1X Halt protease 
inhibitor cocktail (Thermo Scientific)). Proteins (35 µg) were loaded per lane on a
4-20% gradient SDS-PAGE gel. Western blot was performed with anti-CTCF
clone D31H2 (Cell Signaling 3418S, batch no. 1), DNMT1 antibody (Abcam
ab13537, batch no. GR16960-1), or anti-p65 RelA (BD Bioscience 610869, batch 
no. 50886) antibodies. Anti-RelA immunoblotting served as a loading control for 
protein levels. 

Chromatin immunoprecipitation (ChIP). Ten million cells were cross-linked 
for 10 min in 1% formaldehyde (Sigma) at room temperature, and quenched by 
adding glycine to a final concentration of 0.125 M for 5 min at room temperature. 
Cells were washed twice in chilled PBS, resuspended in buffer containing 50 mM 
HEPES-KOH (pH 7.5), 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 
0.25% Triton X-100 and protease inhibitors (Thermo Scientific) and kept on ice 
for 10 min. Nuclei were pelleted at 800g for 5 min at 4°C and resuspended in 
buffer containing 10 mM Tris-HCl pH 8.0, 200 mM NaCl, 1 mM EDTA, 0.5 mM
EGTA and protease inhibitors (Thermo Scientific) followed by a 10-min incubation
on ice. Nuclei were collected and resuspended in sonication buffer containing
10 mM Tris-HCl pH 8.0, 100 mM NaCl, 1 mM EDTA, 0.5% EGTA, 0.1% Na-deoxycholate,
0.5% N-lauryl sarcosine and protease inhibitors (Thermo
Scientific). Sonication of DNA was performed in an ultrasonicator water bath
(Bioruptor) using two ten-cycle runs of 30 s ‘on’ and 30 s ‘off’ to achieve an average
fragment length of 200-400 bp. After addition of 1% Triton X-100, samples were
centrifuged at 16,000g for 10 min at 4 °C. An aliquot of sonicated DNA was
reverse-crosslinked and run on a 1% agarose gel to confirm fragment size during
each ChIP procedure. Chromatin (25 µg) was immunoprecipitated by adding the
antibody of interest followed by overnight incubation at 4 °C. The following
antibodies were used for ChIP: anti-CTCF (Millipore 07-729, batch no.
DAM1772428), anti-RNA polymerase II clone 4H8 (Millipore 05-623, batch
no. DAM1731474), anti-Ser2P RNA polymerase II clone H5 (Covance
MMS129R, batch no. E10017AF), anti-Ser5P RNA polymerase II clone H14
(Covance MMS134R, batch no. E10142DB), anti-H3K36Me3 (Abcam ab9050,
batch no. 947467), anti-H3K27Me3 (Abcam ab6002, batch no. 934602),
anti-H3K4Me3 clone MC315 (Millipore 04-745, batch no. NG1717145), normal
rabbit IgG (Cell Signaling Technology 2729, batch no. 4), and normal mouse
IgG (Millipore 12-371, batch no. 1718089). After overnight incubation, 30 µl of
Dynal Protein A/G beads (Invitrogen) or Protein L magnetic beads (Biovision)
(for phosphorylated RNA Pol II antibodies) were added and incubated for 1 h at
4 °C. Beads were washed sequentially for 3 min each in low salt (20 mM Tris-HCl
pH 8.0, 150 mM NaCl, 2 mM EDTA, 0.1% SDS, 1% Triton X-100), high salt
(20 mM Tris-HCl pH 8.0, 500 mM NaCl, 2 mM EDTA, 0.1% SDS, 1% Triton
X-100), LiCl buffer (10 mM Tris-HCl pH 8.0, 0.25 M LiCl, 1% NP40, 1% Na-deoxycholate)
and TE buffer. Beads were eluted in 150 µl elution buffer (50 mM
Tris-HCl pH 8.0, 10 mM EDTA, 1% SDS, 50 mM NaHCO3) and treated with 1 µl
RNase A (1 mg ml−1, Ambion) at 37 °C for 30 min. Cross-linking was reversed and
proteins were degraded by addition of 1 µl proteinase K (20 mg ml−1, Ambion)
and incubation at 65 °C for 4 h. Eluted DNA was purified with QIAquick PCR
purification (Qiagen), according to the manufacturer’s instructions.

Immunoprecipitated DNA and 5% input DNA were analysed by SYBR-Green
real-time quantitative PCR. PCR measurements were performed in duplicate.
The average cycle thresholds for the technical replicates were calculated to yield
one value per primer set for each biological replicate and normalized to input
using the formula 2^(C_input − C_immunoprecipitation). These values were further normalized
relative to the rabbit or mouse Ig control IP values for the primer set. Averages
and standard deviations of the normalized biological replicate values were plotted
in the figures and used in t-test calculations. Figure legends indicate the number
of biological replicates (individual IPs) used in each experiment.
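
The two-step ChIP-qPCR normalization can be sketched as follows. The function names are hypothetical, and reading "normalized relative to the Ig control IP" as a simple ratio of the two input-normalized signals is an assumption of this example; the text gives only the input formula 2^(C_input − C_immunoprecipitation) explicitly.

```python
from statistics import mean


def input_normalized(ct_input, ct_ip):
    """Signal relative to input: 2^(C_input - C_immunoprecipitation).

    Both arguments are lists of technical-replicate cycle thresholds for the
    same primer set; the replicates are averaged before the formula is applied.
    """
    return 2 ** (mean(ct_input) - mean(ct_ip))


def enrichment_over_ig(ct_input, ct_specific_ip, ct_ig_ip):
    """Input-normalized specific IP divided by the input-normalized Ig control.

    Assumption: the Ig control step is taken to be a ratio.
    """
    specific = input_normalized(ct_input, ct_specific_ip)
    background = input_normalized(ct_input, ct_ig_ip)
    return specific / background


# Made-up duplicate measurements for one primer set.
fold = enrichment_over_ig([25.0, 25.0], [27.0, 27.0], [30.0, 30.0])
```

In this toy case the specific antibody recovers the region three cycles earlier than the Ig control, which corresponds to an eightfold enrichment.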

Methylated DNA immunoprecipitation (MedIP). MedIP was performed essentially
according to the protocol described in ref. 44. Genomic DNA was purified
from approximately 25 million cells using Zymo Research Quick gDNA Midiprep
kit (D3100), according to the manufacturer’s instructions. For primary cells,
CD3+ T cells were isolated from peripheral blood using CD3 microbeads
(Miltenyi Biotech). CD3+ T cells were sorted into CD45RB high and CD45RB
medium populations based on surface receptor staining of CD45RB and
CD45RO. Purified genomic DNA was diluted into a total of 300 µl TE buffer
and sonicated with a Bioruptor (10 cycles at low power, of 30 s ‘on’ and 30 s ‘off’)
to an average size of 300-500 bp. An aliquot of sonicated DNA was run on 1%
agarose gel to confirm fragment size during each MedIP procedure. Sonicated
DNA (4 µg; 3 µg for primary cells) was denatured by incubation at 95 °C for
10 min and was immediately transferred to ice for 10 min. Immunoprecipitation
buffer containing 10 mM sodium phosphate, 140 mM NaCl and
0.05% Triton X-100 was added to a final volume of 500 µl. For each IP reaction,
10 µg (8 µg for primary cells) of 5-methyl cytidine antibody clone b (Diagenode
MAb-006-100, batch no. DA-0018) was added and incubated overnight at 4 °C
with shaking. Five percent of DNA was kept as input.

After incubation, 30 µl of Dynal Protein G beads were added and further
incubated for 1 h at 4 °C. Beads were washed thrice with 500 µl of IP buffer.
Elution buffer (150 µl) containing 50 mM Tris-HCl pH 8.0, 10 mM EDTA, 1%
SDS, 50 mM NaHCO3 and 20 µg proteinase K was added and incubated at 55 °C
for 3 h. Tubes were applied to a magnetic rack and eluted DNA and input DNA
were purified with the QIAquick PCR purification kit (Qiagen) followed by SYBR-Green
real-time quantitative PCR to identify methylated regions. PCR measurements
were performed in duplicate. The average cycle thresholds for the technical
replicates were calculated to yield one value per primer set for each biological
replicate and normalized to input using the formula 2^(C_input − C_immunoprecipitation).
Averages and standard deviations of the normalized biological replicate values
were plotted in the figures and used in t-test calculations. Figure legends indicate
the number of biological replicates (individual IPs) used in each experiment.

In vitro transcription elongation assay. RNA pol II from yeast Saccharomyces
cerevisiae containing a histidine-tagged Rpb3 subunit was purified as described
previously**. Histidine-tagged TFIIS expression plasmid*® was a gift from C.
Kane. Recombinant TFIIS was purified according to ref. 46, with an additional
purification on a Mono-S column (GE Healthcare). Human CTCF recombinant
protein was obtained from Abnova (catalogue no. H00010664-P01, batch no. 
0991020-2). 

Elongation complex incorporating a 9-nt RNA was assembled as described
previously”, purified with Amicon Ultra-0.5 ml centrifugal filter (Millipore), and
diluted with transcription buffer (TB; 20 mM Tris-HCl pH 7.9, 5 mM MgCl2,
10 mM 2-mercaptoethanol, 40 mM KCl, 0.1 mg ml−1 BSA). The reaction was
initiated by mixing 5 µl of TEC +/− XM CTCF with 5 µl of 0.1-0.5 mM
NTP (GE Healthcare) +/− 1 µM TFIIS in TB and was terminated with gel-loading
buffer (5 M urea, 25 mM EDTA at final concentration). RNA products
were resolved in 20% denaturing polyacrylamide gels and visualized with a
Typhoon 8600 phosphoimager (GE Healthcare).

Oligonucleotides used for elongation complex. Sequences of RNA and DNA 
oligonucleotides are as follows. RNA, 5’-AUCGAGAGG-3’; DNA with CTCF 
binding site, non-template strand, 5’-GGTATAGGATACTTACAGCCATCGA 
GAGGGACAAGGCGAAAGCATCCACCAGGGGGCGCCAGCTAAT-3’; tem- 
plate strand, 5’-ATTAGCTGGCGCCCCCTGGTGGATGCTTTCGCCTTGTCC 
CTCTCGATGGCTGTAAGTATCCTATACC-3’. 

Electrophoretic mobility shift assay (EMSA). The CTCF-binding oligonucleo- 
tides used for EMSA correspond to either the template used for in vitro transcrip- 
tion (Supplementary Fig. 6a) or the CTCF binding sites in the wild-type and 
mutated I3-I7 minigenes (Supplementary Fig. 6d). The two strands of DNA were
annealed, 5’ end-labelled with [γ-32P]ATP and purified with a G-50 Micro column
(GE Healthcare). DNA probe (3 pM) equalling approximately 70,000 c.p.m.
was mixed with glutathione S-transferase (GST)-tagged CTCF in binding buffer
containing PBS and 5 mM MgCl2, 0.1 mM ZnSO4, 1 mM DTT, 0.1% NP40 and
10% glycerol. EMSA reaction mixtures (20 µl final volume) were incubated for
20 min at room temperature followed by electrophoresis on 5% native polyacry- 
lamide gels and visualized as described above for in vitro transcription. 

EMSA DNA probe sequences. In vitro transcription probe, 5’-CATCCACCAG 
GGGGCGCCAGCTAAT-3’ and 5'-ATTAGCTGGCGCCCCCTGGTGGATG-3’; 
wild-type exon 5 probe, 5’-TCAGTTCCAGCAGAGGGCGTCTGCG-3’ and 
5'-CGCAGACGCCCTCTGCTGGAACTGA-3’; mutated exon 5 probe, 5’-TCAG 
TTAAAGCTGAGTACGTCTGCG-3’ and 5’-CGCAGACGTACTCAGCTTTAA 
CTGA-3’. 

ChIP-Seq analyses. Illumina FastQ files were mapped to the human genome 
(hg19) using Bowtie requiring a unique match (using the ‘-m 1’ flag). The
aligned reads in SAM format were converted to BED format before running 
the MACS peak caller*’. MACS peak calling was run using the rabbit Ig control 
sequencing data as background files for BJAB and BL41 CTCF ChIP-Seq data, 
respectively. The number of peaks identified per ChIP-Seq sample and sequenced 
reads are listed in Supplementary Table 2. 

To investigate the effects of CTCF binding upon pre-mRNA splicing, we 
compared CTCF ChIP-Seq peaks with a set of alternative exons’ requiring that 
the CTCF peak summit was located within 1,000 bp of the alternative exon 
boundaries (see Fig. 5a). Based on the presence of local CTCF peaks in BL41, 
BJAB and CD4 we classified each alternative exon into exons that were unbound 
by CTCF, and exons with either downstream or upstream CTCF binding. 
Classified unbound exons lacked CTCF peak summits in both the alternative 
exon body and within 1,000 bp on either side of the alternative exon. Exons with 
downstream CTCF binding had one or more CTCF peak summits within the 
region spanning from the alternative exon 5’ splice site and 1,000 bp downstream 
in one or more of the CTCF data in BJAB, BL41 or CD4. Any alternative exons 
with a downstream CTCF peak but additional peak summits in the upstream 
region or within the alternative exon were not considered. The reciprocal pro- 
cedure was used to classify exons with upstream CTCF binding. Alternative exons 
classified by local CTCF binding together with exon inclusion levels are provided 
in Supplementary Table 4. 
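
The classification rules above can be illustrated with a minimal sketch for a single exon and a single set of peak summits. The function name, the coordinate convention (inclusive genomic coordinates with `exon_start` less than or equal to `exon_end`) and the treatment of boundary positions are assumptions of this example. The paper pools summits from the BL41, BJAB and CD4 data sets, which here would simply mean passing the combined summit list.

```python
def classify_exon(exon_start, exon_end, strand, summits, flank=1000):
    """Classify an alternative exon by nearby CTCF peak summits.

    Returns 'unbound', 'downstream', 'upstream', or None when the exon has
    summits in more than one region (or in the exon body) and is therefore
    not considered.
    """
    in_body = any(exon_start <= s <= exon_end for s in summits)
    left = any(exon_start - flank <= s < exon_start for s in summits)
    right = any(exon_end < s <= exon_end + flank for s in summits)

    # Downstream is measured along the direction of transcription, so the
    # genomic left and right flanks swap roles on the minus strand.
    if strand == "+":
        upstream, downstream = left, right
    else:
        upstream, downstream = right, left

    if not (in_body or upstream or downstream):
        return "unbound"
    if downstream and not (upstream or in_body):
        return "downstream"
    if upstream and not (downstream or in_body):
        return "upstream"
    return None


labels = [
    classify_exon(5000, 5100, "+", []),
    classify_exon(5000, 5100, "+", [5500]),
    classify_exon(5000, 5100, "-", [5500]),
    classify_exon(5000, 5100, "+", [4800, 5500]),
    classify_exon(5000, 5100, "+", [5050]),
]
```

Excluding exons with mixed evidence keeps the upstream and downstream classes mutually exclusive, at the cost of discarding some bound exons.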

RNA-Seq analyses. Illumina FastQ files were mapped to the human genome 
(hg19) and a collection of junctions using Tophat version 1.1.4 (ref. 50), using 
the paired-end mode and requiring a unique match. The resulting SAM file with 
uniquely mapped reads was converted to BAM format using samtools*’. Mapping 
statistics for the RNA-Seq data are provided in Supplementary Table 3. We 
estimated exon inclusion levels of a collection of 42,557 alternative exons
(approximately the same as in ref. 1) using the MISO program” with the default
parameters using the ‘compute-genes-psi’ function. The estimated exon inclusion
levels from different RNA-Seq experiments were compared using MISO function
‘compare-samples’ to obtain exon inclusion level differences and Bayes factors.
Statistically significant differences in exon inclusion levels between CTCF-bound 
and -unbound exons at different thresholds were evaluated using the Wilcoxon 
rank sum test. The overall difference in gene expression across RNA-Seq samples 
was evaluated using singular value decomposition. First we computed the 
expression level of each Refseq transcript as reads per kilobase and million map- 
pable reads using the rpkmforgenes program”. The full gene expression matrix 
was normalized to unit length per transcript and subsequently used as input for 
singular value decomposition using the svdman program’’. The result in 
Supplementary Fig. 8c was obtained by projecting each sample onto the first 
two ‘eigenarrays”™*. 
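
As a rough illustration of the expression pre-processing described above, the sketch below computes reads per kilobase and million mappable reads and scales one transcript's values across samples to unit length. It is a stand-in written for this example, not the rpkmforgenes or svdman programs used in the paper, and the function names and numbers are hypothetical. The singular value decomposition step itself is not shown.

```python
import math


def rpkm(read_count, transcript_length_bp, mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1e3
    millions = mapped_reads / 1e6
    return read_count / kilobases / millions


def unit_length(row):
    """Scale one transcript's expression values across samples to norm 1.

    A row of zeros is returned unchanged to avoid division by zero.
    """
    norm = math.sqrt(sum(x * x for x in row))
    if norm == 0:
        return list(row)
    return [x / norm for x in row]


# Made-up example: 500 reads on a 2 kb transcript, 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
scaled = unit_length([3.0, 4.0])
```

Scaling each transcript to unit length means the decomposition compares expression patterns across samples and is not dominated by highly expressed transcripts.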

RNA polymerase II ChIP-Seq analysis. We generated normalized RNA poly- 
merase II fold enrichment signals over a set of regions by dividing the observed 
read sum at each position with the expected read sum computed as: (total_reads * 
read_length * number_of_regions) / genome_length. The normalized fold enrich- 
ment at each position was smoothened by window averaging using a window size 
of 100 nucleotides. We analysed all alternative exons with a downstream CTCF 
peak summit conserved in CD4* T cells (Supplementary Table 4) after removing 
exons with a CTCF peak within 1,000 bp of an annotated transcript start site or 
poly A site in Ensembl (to remove effects from strong RNA pol II signals at 
transcript start and end locations). This procedure rendered 408 exons from which 
we computed both the CTCF peak summit position and exon middle coordinate. 
These two sets of 408 genomic coordinates each were used as the centre for the 
analysis in Fig. 5d. The same procedure was used to generate CTCF peak summits 
and middle exon positions for alternative exons with a conserved CTCF peak in the 
upstream region for Supplementary Fig. 8e (number of exons identified was 416). 
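
The fold-enrichment calculation above can be sketched directly from the stated formula. The function names, the centred window and the shortening of the window at the edges of the profile are assumptions of this example; the text specifies only the expected read sum and a window size of 100 nucleotides.

```python
def expected_read_sum(total_reads, read_length, number_of_regions, genome_length):
    """Expected read sum per position under uniform coverage."""
    return total_reads * read_length * number_of_regions / genome_length


def fold_enrichment(observed, expected):
    """Divide the observed read sum at each position by the expected value."""
    return [x / expected for x in observed]


def smooth(values, window=100):
    """Window averaging; the window is truncated at both ends of the profile."""
    smoothed = []
    for i in range(len(values)):
        start = i - window // 2
        lo = max(0, start)
        hi = min(len(values), start + window)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up sequencing depth and read length; 408 regions as in the text.
expected = expected_read_sum(1_000_000, 36, 408, 3_000_000_000)
profile = smooth(fold_enrichment([0.0, 6.0, 12.0, 18.0], 2.0), window=3)
```

Dividing by the expected read sum puts profiles from libraries of different depth on a common scale, so a value of 1 corresponds to uniform background coverage.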

Minigenes and transfection. CD45 minigenes were cloned into the pCI-neo
mammalian expression vector (Promega). The wild-type CD45 minigene consists
of 9.7 kb of CD45 genomic DNA sequence extending from 2.4 kb of intron 3 through
588 bp of intron 7. The I3-I7*CTCF minigene was made by mutating the CTCF
binding site of exon 5 using site-directed mutagenesis with the primers indicated in
Supplementary Table 1. The minigenes were transfected (Lipofectamine 2000,
Invitrogen) along with pCI-neo vector control into HEK293 cells, CHO cells and
NIH3T3 cells. Cells were collected 48 h after transfection for RNA isolation
(RNeasy, Qiagen) and chromatin immunoprecipitation. Transfection was performed
in triplicate for HEK293 and CHO cells, with an individual RNA preparation
(HEK293 and CHO) and duplicate ChIPs (CHO) derived from each of the
three dishes. Transfection was performed in a single dish for NIH3T3 cells with
three individual RNA preparations and triplicate ChIPs derived from the one dish.

Low-Mach-number turbulence in interstellar gas
revealed by radio polarization gradients 


B. M. Gaensler!, M. Haverkorn??*, B. Burkhart®, K. J. Newton-McGee"®, R. D. Ekers®, A. Lazarian®, N. M. McClure-Griffiths®, 


T. Robishaw’, J. M. Dickey’ & A. J. Green! 


The interstellar medium of the Milky Way is multiphase, magnetized
and turbulent. Turbulence in the interstellar medium produces a
global cascade of random gas motions, spanning scales ranging
from 100 parsecs to 1,000 kilometres (ref. 4). Fundamental
parameters of interstellar turbulence such as the sonic Mach
number (the ratio of the speed of random motions to the speed of
sound) have been difficult to determine,
because observations have lacked the sensitivity and resolution to
image the small-scale structure associated with turbulent motion.
Observations of linear polarization and Faraday rotation in radio
emission from the Milky Way have identified unusual polarized
structures that often have no counterparts in the total radiation
intensity or at other wavelengths, and whose physical significance
has been unclear. Here we report that the gradient of
the Stokes vector (Q, U), where Q and U are parameters describing
the polarization state of radiation, provides an image of magnetized
turbulence in diffuse, ionized gas, manifested as a complex
filamentary web of discontinuities in gas density and magnetic field.
Through comparison with simulations, we demonstrate that
turbulence in the warm, ionized medium has a relatively low sonic
Mach number, M_s ≲ 2. The development of statistical tools for
the analysis of polarization gradients will allow accurate
determinations of the Mach number, Reynolds number and magnetic
field strength in interstellar turbulence over a wide range of
conditions.

We consider radio-continuum images of an 18-deg² patch of the
Galactic plane, observed with the Australia Telescope Compact Array
(ATCA) at a frequency of 1.4 GHz. Data were simultaneously recorded
in total intensity (Stokes parameter I) and in linear polarization (Stokes
parameters Q and U). The Stokes I image (Fig. 1) shows a typical
distribution of radio emission, consisting of supernova-remnant shells,
ionized regions around massive stars (H II regions) and unresolved
distant radio sources. However, the corresponding images of Q, U
and the linearly polarized intensity P = (Q² + U²)^(1/2) in Fig. 1 are filled
with complex structure that bears little resemblance to the Stokes I
image, as has also been seen in many other polarimetric observations at
radio frequencies. The intensity variations seen in Q, U and P are
the result of small-scale angular structure in the Faraday rotation
induced by ionized gas, and are thus an indirect representation of
turbulent fluctuations in the free-electron density and magnetic field
throughout the interstellar medium (ISM).

Figure 1 | Total intensity (I) and linearly polarized intensity (Q, U, P) for an
18-deg² region of the Southern Galactic Plane Survey. All four images were
generated from a set of observations taken at the ATCA over the period 1997
April to 1998 April using a 96-MHz bandwidth centred on an observing
frequency of 1,384 MHz. The field is a mosaic of 190 pointings each with a total
integration time of 20 min, resulting in an approximately uniform sensitivity,
over most of the field, of 0.8 mJy per beam (Stokes I) or 0.55 mJy per beam (Stokes
Q and U) at an angular resolution of 75 arcsec (1 Jy = 10⁻²⁶ W m⁻² Hz⁻¹).
The scale for each image is shown on the right of each panel. The Stokes I
image is displayed over a range of −40 to +150 mJy per beam (each interval
corresponds to 10 mJy per beam). Because the ATCA is an interferometer, it is
not sensitive to structure on angular scales larger than 35 arcmin. Faint wisps
can be seen, corresponding to the sharp edges of large-scale structures.
However, the bulk of the smooth radio emission from Galactic cosmic rays is
not detected. Imaging artefacts in the form of grating rings and radial streaks
can be seen around a few very bright sources, but these regions were not used in
our statistical analysis. The Stokes Q and U images are displayed over a range of
−15 to +15 mJy per beam (interval, 2 mJy per beam), and the P image covers a
range of 0 to 15 mJy per beam (interval, 1 mJy per beam). Almost none of the
structure seen in Q, U and P has any correspondence with any emission seen in
Stokes I; the mottled structure results from spatial fluctuations in Faraday
rotation in the ISM.

A limitation of previous studies is that they usually interpreted
the data in terms of the amplitude, P, and/or the angle,
θ = (1/2) tan⁻¹(U/Q), of the complex Stokes vector P = (Q, U).
However, neither polarization amplitude nor polarization angle is
preserved under arbitrary translations and rotations in the Q-U plane.
These can result from one or more of a smooth distribution of
intervening polarized emission, a uniform screen of foreground Faraday
rotation, and the effects of missing large-scale structure in an
interferometric data set. In the most general case, we are thus forced to
conclude that the observed values of P and θ do not have any physical
significance, and that only measurements of quantities that are both
translationally and rotationally invariant in the Q-U plane can provide
insight into the physical conditions that produce the observed
polarization distribution.

The simplest such quantity is the spatial gradient of P, that is, the 
rate at which the polarization vector traces out a trajectory in the Q-U 
plane as a function of position on the sky. The magnitude of the 
gradient is unaffected by rotation and translation, and so has the 
potential to reveal properties of the polarization distribution that 
might otherwise be hidden by excess foreground emission or 
Faraday rotation, or in data sets from which large-scale structure is 
missing (as is the case for the data shown in Fig. 1). The magnitude of 
the polarization gradient is 


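$$|\nabla \mathbf{P}| = \sqrt{\left(\frac{\partial Q}{\partial x}\right)^{2} + \left(\frac{\partial U}{\partial x}\right)^{2} + \left(\frac{\partial Q}{\partial y}\right)^{2} + \left(\frac{\partial U}{\partial y}\right)^{2}} \qquad (1)$$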


The expression in equation (1) can be calculated simply, and the
corresponding image of |∇P| (Fig. 2) reveals a complex network of
tangled filaments. In particular, all regions in which |∇P| is high
consist of elongated, narrow structures rather than extended patches. In
the inset of Fig. 2, we plot the direction of ∇P for a small subregion of
the image, demonstrating that ∇P changes most rapidly along
directions oriented perpendicular to the filaments. We can explore the
frequency dependence of these filaments, because the 1.4-GHz
ATCA data shown in Fig. 1 consist of nine independent spectral
channels of width 8 MHz, spread over a total bandwidth of 96 MHz.
We have constructed images of |∇P| for each individual spectral
channel, and these show the same set of specific features as in the
overall image, albeit at reduced signal-to-noise ratios. The lack of
frequency dependence indicates that the high-gradient structures seen
in this data set correspond to physical features in the ISM rather than
to contour lines introduced by the particular combination of observing
frequency and angular resolution used.

Figure 2 | |∇P| for an 18-deg² region of the Southern Galactic Plane Survey.
|∇P| has been derived by applying equation (1) to the Q and U images from
Fig. 1; |∇P| cannot be constructed from the scalar quantity
P = (Q² + U²)^(1/2), but is derived from the vector field P = (Q, U). |∇P| is a
gradient in one dimension, for which the appropriate units are (beam)^−0.5.
Because P measures linearly polarized intensity in units of millijanskys per
beam, the units of |∇P| are mJy per (beam)^1.5. The scale showing |∇P| is
shown on the right of the image, and ranges from 0 to 15 mJy per (beam)^1.5. The
inset shows an expanded view of part of the image, covering a
box of side 0.9 deg centred on Galactic longitude 329.8 deg and Galactic latitude
+1.0 deg. Plotted in the inset is the direction of ∇P at each position.
For clarity, vectors are shown only at points where the amplitude of the gradient
is greater than 5 mJy per (beam)^1.5.
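
As a minimal sketch of how equation (1) might be evaluated on gridded Stokes Q and U images, the following uses finite differences from NumPy. The function names, the unit pixel spacing and the random test images are assumptions made for illustration; they are not taken from the analysis described here. The second function checks numerically that a rotation and translation in the Q-U plane changes the polarized amplitude but leaves the gradient magnitude untouched.

```python
import numpy as np

def polarization_gradient(Q, U, pixel=1.0):
    """|grad P| of equation (1) for the Stokes vector P = (Q, U) on a regular grid."""
    dQ_dy, dQ_dx = np.gradient(Q, pixel)   # derivatives along axis 0 (y) and axis 1 (x)
    dU_dy, dU_dx = np.gradient(U, pixel)
    return np.sqrt(dQ_dx**2 + dU_dx**2 + dQ_dy**2 + dU_dy**2)

def rotate_and_shift(Q, U, angle, dQ=0.0, dU=0.0):
    """Rotate (Q, U) by `angle` radians in the Q-U plane, then add a constant offset."""
    c, s = np.cos(angle), np.sin(angle)
    return c * Q - s * U + dQ, s * Q + c * U + dU

# Hypothetical test images (pure noise); any pair of Q, U maps would do.
rng = np.random.default_rng(0)
Q = rng.normal(size=(64, 64))
U = rng.normal(size=(64, 64))

g0 = polarization_gradient(Q, U)
Q2, U2 = rotate_and_shift(Q, U, angle=0.7, dQ=3.0, dU=-1.5)
g1 = polarization_gradient(Q2, U2)

amp_changed = not np.allclose(np.hypot(Q, U), np.hypot(Q2, U2))
grad_same = np.allclose(g0, g1)
```

The invariance follows because differentiation removes the constant offset and a rotation preserves the sum of squares of the two components in each direction.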

We first consider the possibility that these filaments of high gradient 
are intrinsic to the source of emission. Abrupt spatial transitions in the 
strength or geometry of the magnetic field in a synchrotron-emitting 
region would generate a large gradient in (Q, U). However, processes of 
that sort would also produce structure in the overall synchrotron 
emissivity, such that we would observe features in the image of 
Stokes I that match those seen in |∇P|. No such correspondence is
observed, demonstrating that the regions of high polarization gradient 
are not intrinsic to the source of polarized emission but must be 
induced by Faraday rotation in magneto-ionized gas. 

Because the amount of Faraday rotation is proportional to the line
integral of n_e B_∥ from the source to the observer (where n_e is the density
of free electrons and B_∥ is the uniform component of the line-of-sight
magnetic field), the filamentary structure seen in |∇P| must
correspond to boundaries across which n_e and/or B_∥ show a sudden increase
or decrease over a small spatial interval. Such discontinuities could be
shock fronts or ionization fronts from discrete sources, as have been
observed in polarization around the rims of supernova remnants, H II
regions and planetary nebulae. We have examined this possibility
by carefully comparing our image of |∇P| with images and gradient
images of Stokes I (tracing shock waves seen in synchrotron emission;
ref. 11), 21-cm H I emission (tracing atomic hydrogen) and 656.3-nm
Hα emission (tracing ionized hydrogen) over the same field, but do
not find any correspondences.

We conclude that the features seen in |∇P| are a generic component
of diffuse, ionized gas in this direction in the sky. To test this
hypothesis, we performed a series of three-dimensional isothermal
simulations of magnetohydrodynamic turbulence in the ISM, each with
different parameters for the sonic Mach number, defined as
M_s = ⟨|v|/c_s⟩, where v is the local velocity, c_s is the sound speed and
the averaging (indicated by angle brackets) is done over the whole
simulation. For each simulation, we propagated a uniform source of
polarized emission through the distribution of turbulent, magnetized
gas. The resultant Faraday rotation produces a complicated
distribution on the sky of Stokes Q and U, from which we generated a
map of the polarization gradient using equation (1). Images of |∇P|
for representative simulations of the subsonic, transonic and
supersonic regimes are shown in Fig. 3. Narrow, elongated filaments of high
polarization gradient are apparent in each simulation in Fig. 3,
although they differ in their morphology and degree of organization.
In particular, the supersonic case (Fig. 3c) shows localized groupings of
very high-gradient filaments, corresponding to ensembles of
intersecting shocks. By contrast, the subsonic (Fig. 3a) and transonic
(Fig. 3b) cases show more-diffuse networks of filaments, representing
the cusps and discontinuities characteristic of any turbulent velocity
field.
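
The propagation step can be sketched as follows, under stated assumptions. The rotation measure is taken as RM = 0.812 ∫ n_e B_∥ dl rad m⁻² (n_e in cm⁻³, B_∥ in μG, path length in pc), and the polarization angle rotates by RM λ². The function name and array layout are hypothetical, and a uniform slab stands in for the output of a magnetohydrodynamic simulation.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light, m/s

def faraday_screen(n_e, b_par, dl_pc, freq_hz=1.4e9, Q0=100.0, U0=0.0):
    """Rotate a uniform polarized background (Q0, U0) through a Faraday screen.

    n_e   : electron density cube, cm^-3, shape (nz, ny, nx); axis 0 is the line of sight
    b_par : line-of-sight magnetic field cube, microgauss, same shape
    dl_pc : cell size along the line of sight, pc
    Returns the Q and U images and the rotation-measure image (rad m^-2).
    """
    rm = 0.812 * np.sum(n_e * b_par, axis=0) * dl_pc
    chi = rm * (C_LIGHT / freq_hz) ** 2            # rotation of the polarization angle, rad
    c2, s2 = np.cos(2.0 * chi), np.sin(2.0 * chi)  # Q and U rotate by twice the angle
    return c2 * Q0 - s2 * U0, s2 * Q0 + c2 * U0, rm

# Hypothetical uniform slab: 10 cells of 0.15 pc, n_e = 0.1 cm^-3, B_par = 1 microgauss.
n_e = np.full((10, 4, 4), 0.1)
b_par = np.full((10, 4, 4), 1.0)
Q, U, rm = faraday_screen(n_e, b_par, dl_pc=0.15)
```

Because no depolarization is modelled, the polarized amplitude stays equal to its input value everywhere; all of the structure produced by a non-uniform screen appears in Q and U separately, which is why the gradient of (Q, U) is the informative quantity.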

Visual comparison of the simulated distributions of |∇P| with real
data (Fig. 2) suggests that the subsonic and transonic cases shown in
Fig. 3a, b more closely resemble the observations than does the
supersonic case. We can quantify this statement by calculating the
third-order moment (skew, γ) and the fourth-order moment (kurtosis, β) of
the probability distribution function of |∇P| for both observations and
simulations: these quantities parameterize the degree of Gaussian
asymmetry in the probability distribution function, and hence provide
information on the amount of compression due to shocks in the
data.

Figure 3 | |∇P| derived from propagation of linear radio polarization
through three different isothermal simulations of magnetized turbulence.
Each simulation is a 512 × 512 × 512-element periodic box with a linear
dimension of 0.15 pc for each pixel, evolved in time using an essentially
non-oscillatory scheme. Three such simulations are shown, each labelled with its
corresponding value for M_s: subsonic (M_s < 1; a), transonic (M_s ~ 1; b) and
supersonic (M_s > 1; c). At the start of each simulation, the electron density had
a uniform value n_e = 0.1 cm⁻³ and the magnetic field was oriented in the plane
of the sky with a uniform amplitude of B = 0.3 μG (subsonic), 1 μG (transonic)
or 2 μG (supersonic), which corresponds to a constant Alfvénic Mach number
of M_A = 2 in each case. Turbulence was driven solenoidally in Fourier space at
large scales (small wavenumber) until the turbulent cascade had fully developed
and a steady state between input energy and dissipation had been reached. In
each case, we illuminated the simulation volume with a background radio
source of uniform polarization at an emission frequency of 1.4 GHz, with
Q = 100 mJy per pixel and U = 0 mJy per pixel at all positions. At each pixel, the
line integral of n_e B_∥ was computed, and the corresponding Faraday rotation
was applied to the polarized signal, to calculate values of Q and U. No effects due
to finite angular resolution, depolarization or incomplete interferometric
visibility coverage were included, so the observed polarized signal is
P = 100 mJy per pixel at all positions. We then calculated the gradient,
|∇P|, using equation (1). The scales showing |∇P| are shown on the right of the
images, and range from 0 to 25 mJy per (pixel)^1.5 for a (interval, 1 mJy per
(pixel)^1.5), 0 to 100 mJy per (pixel)^1.5 for b (interval, 10 mJy per (pixel)^1.5) and 0
to 500 mJy per (pixel)^1.5 for c (interval, 100 mJy per (pixel)^1.5).


In the simulations, we found that both the skew and the kurtosis of
|∇P| increased monotonically with sonic Mach number. We used a
genetic algorithm to determine that the threshold for strongly
supersonic turbulence was γ > 1.1 and β > 1.5. We then computed the
third- and fourth-order moments for the observed distribution of
|∇P| shown in Fig. 2, and found that γ = 0.3 and β = 0.9.
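
A minimal sketch of the moment calculation is given below. It assumes the standardized third moment for the skew and the excess kurtosis (fourth standardized moment minus 3) for the kurtosis; the text does not state which kurtosis convention was used, so that choice is an assumption, as are the function names. The thresholds are the values quoted above.

```python
import numpy as np

def skew_kurtosis(values):
    """Standardized third moment and excess kurtosis of a sample.

    Assumption: kurtosis is reported relative to a Gaussian (minus 3).
    """
    x = np.asarray(values, dtype=float).ravel()
    z = (x - x.mean()) / x.std()
    return np.mean(z**3), np.mean(z**4) - 3.0

def exceeds_supersonic_threshold(skew, kurt, skew_min=1.1, kurt_min=1.5):
    """True if both moments exceed the quoted thresholds for strongly supersonic turbulence."""
    return bool(skew > skew_min and kurt > kurt_min)

# The observed moments quoted in the text fall below both thresholds.
observed_is_supersonic = exceeds_supersonic_threshold(0.3, 0.9)
```

In practice `skew_kurtosis` would be applied to the pixel values of a |∇P| image after masking regions affected by imaging artefacts.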

This analysis of the moments of the polarization gradient therefore
confirms quantitatively what we concluded above from visual
inspection: the turbulent, ionized ISM in this direction in the sky is subsonic
or transonic. The findings we obtained by imaging the polarization
gradients produced by interstellar turbulence are supported by recent
statistical studies of Hα emission measures and of 21-cm H I column
densities over large volumes, which have similarly found that M_s ≲ 2
for warm gas throughout the ISM.

In the simulations shown in Fig. 3, the sharp gradients in (Q, U)
occur as a result of localized high values of the gas density and magnetic
field, resulting from vorticity or shock compression. However, the
filamentary features seen in |∇P| may not be easily observable in other
types of data: for example, if we adopt typical parameters for warm,
ionized gas of n_e ≈ 0.3 cm⁻³ and B_∥ ≈ 2 μG, even the compression
associated with a strong adiabatic shock produces across-filament
changes in emission measure and Faraday rotation measure of only
~0.5 pc cm⁻⁶ and ≲5 rad m⁻², respectively, assuming a spatial
scale for these structures of ~0.5 pc. This is below observable levels
in Hα and other tracers of emission measure. The rotation measure
gradient across these interfaces is potentially observable in
spectropolarimetric radio data, but the addition of single-dish observations is
required to recover the total power of the polarized signal. By contrast,
even a small gradient in rotation measure can produce an arbitrarily
large value of |∇P| (irrespective of whether single-dish measurements
are present in the data), provided that there is a strong source of
background polarized emission through which the discontinuities in
Faraday rotation are viewed. Further investigation of the polarization
gradient and its statistical properties will provide robust estimates of
poorly constrained parameters of turbulent flows such as the sonic and
Alfvénic Mach numbers, the characteristic magnetic field strength, the
Reynolds number and the physical scale of energy injection.
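
The order of magnitude of the across-filament changes quoted above can be checked with a short calculation. This is only a sketch under assumptions that are not stated in the text: the emission measure is n_e² L, the rotation measure is 0.812 n_e B_∥ L, a strong adiabatic shock in a monatomic gas compresses the density by a factor of 4, and the line-of-sight field is compressed by the same factor (an upper limit).

```python
K_RM = 0.812  # rad m^-2 per (cm^-3 * microgauss * pc)

def emission_measure(n_e, path_pc):
    """Emission measure in pc cm^-6 for a uniform slab."""
    return n_e**2 * path_pc

def rotation_measure(n_e, b_par, path_pc):
    """Rotation measure in rad m^-2 for a uniform slab."""
    return K_RM * n_e * b_par * path_pc

n_e, b_par, width = 0.3, 2.0, 0.5   # cm^-3, microgauss, pc (values quoted in the text)
compression = 4.0                   # assumption: strong adiabatic shock, gamma = 5/3

d_em = emission_measure(compression * n_e, width) - emission_measure(n_e, width)
d_rm = (rotation_measure(compression * n_e, compression * b_par, width)
        - rotation_measure(n_e, b_par, width))
```

Under these assumptions the changes come out at roughly 0.7 pc cm⁻⁶ and 3.7 rad m⁻², the same order as the values quoted in the text.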
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Crystal structure of a bacterial homologue of the bile 
acid sodium symporter ASBT 


Nien-Jen Hu, So Iwata, Alexander D. Cameron & David Drew


High cholesterol levels greatly increase the risk of cardiovascular
disease. About 50 per cent of cholesterol is eliminated from the
body by its conversion into bile acids. However, bile acids released
from the bile duct are constantly recycled, being reabsorbed in the
intestine by the apical sodium-dependent bile acid transporter
(ASBT, also known as SLC10A2). It has been shown in animal
models that plasma cholesterol levels are considerably lowered by
specific inhibitors of ASBT, and ASBT is thus a target for
hypercholesterolaemia drugs. Here we report the crystal structure of
a bacterial homologue of ASBT from Neisseria meningitidis
(ASBTNM) at 2.2 Å. ASBTNM contains two inverted structural
repeats of five transmembrane helices. A core domain of six helices
harbours two sodium ions, and the remaining four helices pack in a
row to form a flat, 'panel'-like domain. Overall, the architecture of
the protein is remarkably similar to the sodium/proton antiporter
NhaA, despite having no detectable sequence homology. The
ASBTNM structure was captured with the substrate taurocholate
present, bound between the core and panel domains in a large,
inward-facing, hydrophobic cavity. Residues near this cavity have
been shown to affect the binding of specific inhibitors of human
ASBT. The position of the taurocholate molecule, together with
the molecular architecture, suggests the rudiments of a possible
transport mechanism.

ASBT is an SLC10 (sodium bile acid co-transporter family) member
that moves bile acids across the apical membrane of the ileum into the
portal blood vein. ASBT uses the sodium ion gradient to drive the
'uphill' transport of bile acids across membranes, with a reported
stoichiometry of two sodium ions per substrate. Mutations in the
human ASBT gene cause a condition of primary bile acid
malabsorption. ASBT is a pharmaceutical target for drugs aimed at lowering
cholesterol, and several ASBT inhibitors have been developed that are
effective in animal models. Because some drugs are poorly absorbed
in the intestine or need to be targeted to the liver, ASBT and its close
liver paralogue, NTCP (SLC10A1), have also received attention as
prodrug carriers, capable of transporting various compounds coupled
to bile acid, for example HMG-CoA reductase (HMGCR) inhibitors,
the antiviral drug acyclovir, nucleotides and cytostatic drugs.

ASBTNM from N. meningitidis, with 26% identity and 54% similarity
to human ASBT, was identified by fluorescence-based screening
methods as a suitable candidate for structural studies
(Supplementary Figs 1 and 2). Residues known to be functionally important in
mammalian ASBT and other SLC10 members are well conserved in
ASBTNM (Supplementary Fig. 1). Bile acid transport by ASBTNM was
confirmed in whole cells by the sodium-dependent uptake of
[3H]-taurocholate (Fig. 1a). The observed Michaelis constant, K_m, for
[3H]-taurocholate is ~50 μM (Fig. 1b), a value similar to that measured for
rat and human ASBT. The ASBT inhibitors cyclosporin A and
bromosulfophthalein and the drug fluvastatin are also competitors
for ASBTNM-mediated [3H]-taurocholate transport (Fig. 1c). Thus,
ASBTNM is a valid model of mammalian bile acid transporters. We
solved the structure of ASBTNM by single-wavelength anomalous
scattering and refined it at a resolution of 2.2 Å (Supplementary Tables 1
and 2; Methods).
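
To illustrate how a Michaelis constant might be extracted from uptake data of this kind, the sketch below subtracts a control uptake from the total uptake and fits the Michaelis-Menten relation v = V_max [S] / (K_m + [S]) through the linear Hanes-Woolf form [S]/v = [S]/V_max + K_m/V_max. The function names, the synthetic concentrations and the control uptake are hypothetical, and the fitting method actually used for the measurements is not described in this section.

```python
import numpy as np

def michaelis_menten(s, vmax, km):
    """Transport rate at substrate concentration s."""
    s = np.asarray(s, dtype=float)
    return vmax * s / (km + s)

def specific_uptake(total, control):
    """Uptake attributable to the transporter: total minus control-cell uptake."""
    return np.asarray(total, dtype=float) - np.asarray(control, dtype=float)

def fit_hanes_woolf(s, v):
    """Estimate (vmax, km) from a straight-line fit of s/v against s."""
    s = np.asarray(s, dtype=float)
    v = np.asarray(v, dtype=float)
    slope, intercept = np.polyfit(s, s / v, 1)
    return 1.0 / slope, intercept / slope

# Synthetic, noise-free example with K_m = 50 (micromolar) and V_max = 1 (arbitrary units).
s = np.array([5.0, 10.0, 25.0, 50.0, 100.0, 200.0, 400.0])
control = 0.002 * s                                   # hypothetical non-specific uptake
total = michaelis_menten(s, vmax=1.0, km=50.0) + control
vmax_fit, km_fit = fit_hanes_woolf(s, specific_uptake(total, control))
```

With noisy data, a nonlinear least-squares fit to the untransformed relation is usually preferred, because the linear transform distorts the error structure.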

Figure 1 | Sodium-dependent transport of bile acid by ASBTNM. a,
Time-dependent uptake of [3H]-taurocholate after expression of ASBTNM in
Escherichia coli, as monitored in buffer containing 137 mM sodium (filled
circles) or <1 mM sodium (open circles). b, Michaelis-Menten transport
kinetics of ASBTNM-mediated [3H]-taurocholate uptake. The specific uptake
(filled circles) was calculated by subtracting the internalization measured from
control cells lacking the transporter (open squares) from the total uptake (open
circles), as detailed in Methods. c, ASBTNM-mediated [3H]-taurocholate uptake
after 5 min in the presence of 150 μM taurocholate, cyclosporin A, fluvastatin or
bromosulfophthalein (black bars), measured as a percentage of the uptake
without their addition (white bar). d, ASBTNM-mediated [3H]-taurocholate
uptake after 5 min for a wild-type control (white bar) and the indicated single
alanine point mutants (black bars). The uptake for the mutants is displayed as a
percentage of the wild-type activity. The expression and detergent-solubilized
folded state of each mutant was similar to wild-type protein (Supplementary
Fig. 2a). Error bars, s.e.m.; n = 3.

Figure 2 | ASBTNM structure. a, Ribbon representation of ASBTNM as viewed
in the plane of the membrane. TM1 to TM10 have been coloured from red at
the N terminus to blue at the C terminus, and the position of the membrane is
depicted in grey. The pink spheres indicate sodium sites Na1 and Na2, and the
stick model represents the substrate taurocholate. b, ASBTNM structure as
viewed from the intracellular side as a ribbon representation (left) and as a
simplified cartoon (right). On the right, the burgundy pentagon represents
taurocholate.

ASBTNM has cytoplasmic amino and carboxy termini, comprises
ten transmembrane helices (TMs) that are linked by short loops, and
has overall dimensions of approximately 45 Å × 30 Å × 30 Å (Fig. 2
and Supplementary Fig. 3). TM1 to TM5 and TM6 to TM10 are
topologically similar but oppositely orientated in the plane of the
membrane. The root mean squared deviation (r.m.s.d.) after
superposition of the two topology-inverted repeats is 3.7 Å (Supplementary
Fig. 4a, b; Methods). Each repeating unit is made of an N-terminal
V-motif (TM1 and TM2, TM6 and TM7) and a core motif of three
helices (TM3 to TM5, TM8 to TM10) (Fig. 2 and Supplementary Figs 3
and 4). If the V-motif and the core motif are superposed separately, the
r.m.s.d. in each case is lower: 2.6 Å and 2.8 Å, respectively
(Supplementary Fig. 4c). The core motifs from each repeat form the core
domain, whereas the two V-motifs create a panel-like domain
(Fig. 2b). TM4 and TM9 in the core domain are broken in the middle
(discontinuous) and form helical hairpins with TM5 and TM10,
respectively, which are both kinked. At the point where TM4 and
TM9 are broken by well-conserved peptide motifs, they cross over
(Fig. 2 and Supplementary Figs 5 and 6). On the intracellular side, a
wide crevice separates the core domain from the panel domain
(Fig. 3a). The cavity extends over halfway through the protein. The
extracellular side of the cavity is tightly closed by TM1, TM2, TM4b,
TM7, TM9b and TM10. Previously, two topology models of ASBT
were proposed with seven and nine transmembrane helices,
respectively. Because TM1 is not conserved in ASBT, the structure is
broadly consistent with the model with nine transmembrane helices
(Supplementary Fig. 5). TM4 and TM9 were annotated as extracellular
loops in the topology model with seven transmembrane helices, but
were correctly identified in the model with nine transmembrane helices.
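
The r.m.s.d. values after superposition quoted above are the kind of quantity that can be computed with the Kabsch algorithm, sketched below for two equally sized sets of matched atom coordinates. This is a generic illustration, not the procedure described in Methods: the function name and test coordinates are hypothetical, and it assumes the atom pairing between the two repeats is already known.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """Minimum r.m.s.d. between matched point sets P and Q (N x 3) after optimal
    rigid-body superposition (translation plus proper rotation)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)                 # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                           # 3 x 3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against an improper rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ Pc.T).T - Qc
    return np.sqrt((diff**2).sum() / len(P))

# Hypothetical coordinates: a rigidly rotated and shifted copy superposes exactly.
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 3))
theta = np.deg2rad(25.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
B = A @ Rz.T + np.array([5.0, -2.0, 1.0])
rmsd_rigid = kabsch_rmsd(A, B)
rmsd_noisy = kabsch_rmsd(A, B + rng.normal(scale=0.5, size=B.shape))
```

Superposing sub-motifs separately, as done for the V-motif and the core motif, amounts to calling such a function on the corresponding subsets of atoms, which can only lower or preserve the r.m.s.d. of each subset.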

Discontinuous transmembrane helices are a common motif in
secondary active transporters. However, the sodium/proton
antiporter NhaA is the only other known example in which these helices
cross as observed in ASBTNM (Supplementary Fig. 6). Indeed,
ASBTNM has a similar structure to NhaA, and they superpose with
an r.m.s.d. of 2.9 Å over 202 Cα atoms (Supplementary Fig. 7a;
Methods). The similarity is more striking when the core and panel
domains are superposed separately (Supplementary Fig. 7b). This
unexpected finding further emphasizes the remarkable plasticity of
transporters, which allows them to use a common scaffold to
translocate different substrates.

Figure 3 | ASBTNM structure is inward facing and contains bound sodium
and bile acid. a, Surface representation showing the location of the
intracellular cavity in which taurocholate binds, as a section through the
protein. b, The sodium-binding sites in ASBTNM. Na1 is octahedrally
coordinated by Ser 114 and Asn 115 on TM4b, Thr 132 and Ser 128 on TM5,
and Glu 260 on TM9a. The square pyramidal arrangement of the Na2 ligands is
made up of Glu 260, Val 261, Met 263 and Gln 264 on TM9 and Gln 77 on TM3.
c, The intracellular cavity in ASBTNM. Residues lining the cavity and near to the
taurocholate are shown. The figures have been coloured as in Fig. 2. A 150-fold
difference in inhibition of the mouse and human forms of ASBT by
benzothiazepines has been ascribed to sequence differences corresponding to
Ser 291 at the bottom of the cavity. Supplementary Fig. 10 shows stereo versions
of b and c.


In ASBT and NTCP, two sodium ions are translocated per bile acid
molecule. In the highly conserved core domain of ASBTNM
(Supplementary Fig. 8), we have identified two sodium-binding sites
(Na1 and Na2) on the basis of the coordination and bond distances
(2.0-2.5 Å) (Fig. 3b and Supplementary Figs 9a and 10a; Methods).
Na1 is located approximately 10 Å from the cytoplasmic surface
between TM4b and TM5, but also interacts with the carboxylate moiety
of Glu 260 on TM9a (Fig. 3b and Supplementary Fig. 10a). The Na2 site
is located 8 Å from Na1, near the centre at the crossover points of TM4a
with TM4b and TM9a with TM9b. Four backbone carbonyl oxygen
atoms coordinate Na2, including Glu 260 on TM9a and the side chains
of Gln 264 on TM9a and Gln 77 on TM3. The residues for which the
side chains interact with the two sodium ions are completely conserved
in ASBT and NTCP (Supplementary Figs 5 and 8). The glutamate
residue equivalent to Glu 260 is essential for activity in ASBT and
NTCP. In ASBTNM, its replacement with alanine significantly
affects transport, as does the mutation of Gln 77 to alanine (Fig. 1d
and Supplementary Fig. 2a). Thus, it seems that both sodium ions are
required for efficient transport. Mechanistically, it is almost certainly
necessary for sodium to be present at the Na2 site to neutralize the
partial negative dipole of TM9a and, by doing so, stabilize the
interaction with TM4a. Neutralization of the helix dipoles seems a
conserved feature for this fold. In NhaA, the corresponding
transmembrane helix is thought to be neutralized by the positive charge
of Lys 300, which is essential for transport.

The substrate-binding cavity is open to the cytoplasm and is
approximately 6 Å × 12 Å × 14 Å in size, with a solvent-accessible
volume of 550 Å³ (Fig. 3a; Methods). Because the N-terminal half of
TM1 is markedly bent outwards, it is more open on one side. The
cavity is much bigger than taurocholate, perhaps reflecting the large
variety of compounds that are recognized by ASBT (Fig. 3a, c). It is
predominantly hydrophobic, but near the bottom there are a number
of polar residues and water molecules (Fig. 3c and Supplementary
Fig. 10b). As judged from high B-factors, taurocholate seems to be
weakly bound (Supplementary Table 2 and Supplementary Fig. 9b).
Consistent with this observation, there is only one direct hydrogen
bond between ASBTNM and taurocholate, from Asn 295 on TM10 to
the 7α hydroxyl group. The mutation of Asn 295 to alanine causes a
dramatic reduction in taurocholate transport (Fig. 1d and
Supplementary Fig. 2a). Water molecules bridge the 7α hydroxyl with His 294 and
the 3α hydroxyl with Asn 265, which is located at the crossover region
of TM9. Thr 112 is also in the vicinity of the 3α group but cannot be
unambiguously placed. The 12α hydroxyl group does not have any
apparent hydrogen-bonding partner. The taurine moiety binds
between TM1 and TM10. Interaction of the taurocholate with residues
in TM10 is in agreement with biochemical data, which indicate that the
last helix in ASBT has a dominant role in the translocation process.
The location of Asn 265 between the TM4b and TM9b dipoles suggests
that it may have a role in the mechanism. The importance of this
residue has been inferred from mutagenesis studies on NTCP. If
Asn 265 is replaced by alanine in ASBTNM, transporter activity is
reduced by ~80% (Fig. 1d and Supplementary Fig. 2a). Although there
are clear similarities between the binding sites in ASBTNM and ASBT,
there are also sequence differences (Supplementary Fig. 5). Such
differences may affect substrate specificity.

For transport to take place, the protein must switch between out- 
ward- and inward-facing states’. The architecture of ASBTyyy pro- 
vides a clue to understanding how this might occur. The sodium ions 
are located in the core domain, close to the crossover points of the 
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Figure 4 | Putative mechanism for ASBT ym transport. a, Superposition of 
the ASBTyy, structure (red, panel domain; blue, core domain) and the 
outward-facing model as described in the text (light grey). The superposition 
has been optimized on the core domains. Loops have been removed for clarity. 
In the right-hand image, the panel domain of the model has been rotated by 25° 
relative to the core domain, around the axis shown in the left-hand image, to 
superimpose the panel domains. Significant kinks in the helices are represented 
as breaks. The area of the cavity is depicted by a tan-coloured trapezoid. 
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discontinuous helices and occluded from the bulk solvent. In NhaA, 
sodium binding causes a rearrangement of these helices**’’. In 
ASBTym, similar rearrangements in the core domain are therefore 
likely. Because NhaA translocates only protons and sodium ions” 
these helix movements might be sufficient for transport. However, 
ASBTym™ transports much larger substrates, and structural movements 
in more than the core domain are therefore needed. For the sodium- 
coupled transporter LeuT, the internal asymmetry of the repeating 
motifs has been used to predict global movements from a single 
structure’, and these movements have been substantiated by crystal- 
lographic studies”. In an analogous manner to LeuT, an outward- 
facing model of ASBTym was generated by superimposing TM1 to 
TM5 on TM6 to TM10, and vice versa (Fig. 4a; Methods). Comparing 
the inward-facing ASBT\ structure with the outward-facing model, 
the largest difference is the position of the panel domain relative to the 
core domain (Fig. 4a, c). A route through the protein between these 
domains is in agreement with experimental data, which suggest that 
the final helix of ASBT and TM9 (IX) of NhaA line the transport 
pathway***?°?°, Notably, the NhaA domain equivalent to the panel 
domain is located between that of the outward-facing and inward- 
facing ASBTy states (Fig. 4b). This may be because NhaA trans- 
locates a much smaller substrate, or it could represent another 
conformation of the transporter, probably an occluded state. 

We propose that sodium binding controls the conformation of the
core domain of ASBTNM, which in turn drives the movement of the
panel domain. This large conformational change of the panel domain
relative to the core domain is required to alter the accessibility of the
substrate-binding pocket. The ASBTNM structure should aid the
design of new inhibitors against ASBT with the goal of treating
hypercholesterolaemia.
Transmembrane helices are numbered from 1 to 10; those coloured grey and
labelled with an asterisk represent the helices in the outward-facing model.

b, NhaA shown in the same view as ASBTNM in a. The core domain is shown in
light blue and the panel domain is shown in maroon. The two additional
transmembrane helices and β-strands that are not present in ASBTNM are
shown in grey. The position where sodium is thought to bind is shown with a
black ring. c, Proposed transport mechanism, illustrating the movement of the
panel domain relative to the core domain to transport sodium and bile acid.
METHODS SUMMARY


ASBTNM was cloned into a cleavable green fluorescent protein (GFP)/His8 fusion
vector, pWaldoGFPe. The fusion protein was expressed in E. coli, solubilized in
1% dodecyl-β-D-maltopyranoside and purified to homogeneity. Before
crystallization, untagged ASBTNM was exchanged into 0.06%
n-dodecyl-N,N-dimethylamine-N-oxide by size-exclusion chromatography. Crystals
were grown in the presence of 10 mM taurocholate by the vapour diffusion method.
Data were collected on beamlines I02 and I03 at the Diamond Light Source, UK,
dehydration of the crystals being necessary to collect high-resolution data. The
protein was derivatized by short-soaking a surface-engineered cysteine mutant
(ASBTNM_1) with 1 mM mercury acetate. The structure of ASBTNM_1 was solved by
single-wavelength anomalous dispersion and subsequently refined against data
collected from ASBTNM at a resolution of 2.2 Å. The cell-based bile acid uptake
assay for ASBTNM was modified from that previously described.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS
ASBTNM sequence. MNILSKISSFIGKTFSLWAALFAAAAFFAPDTFKWAGPY
IPWLLGIIMFGMGLTLKPSDFDILFKHPKVVIIGVIAQFAIMPATAWLLSKLL 
NLPAEIAVGVILVGCCPGGTASNVMTYLARGNVALSVAVTSVSTLISPLLTP 
AIFLMLAGEMLEIQAAGMLMSIVKMVLLPIVLGLIVHKVLGSKTEKLTDALP 
LVSVAAIVLIGAVVGASKGKIMESGLLIFAVVVLHNGIGYLLGFFAAKWTG 
LPYDAQKTLTIEVGMQNSGLAAALAAAHFAAAPVVAVPGALFSVWHNISG 
SLLATYWAAKAGKHKKPGSENLYFQ 
ASBTNM_1 sequence for structure solution. MVAASMNILSKISSFIGKTFSLW
AALFAAAAFFAPDTFKWAGPYIPWLLGIIMFGMGLTLKPSDFDILFKHPKV
VIIGVIAQFAIMPATAWCLSKLLNLPAEIAVGVILVGCCPGGTASNVMTYL
ARGNVALSVAVTSVSTLTSPLLTPAIFLMLAGEMLEIQAAGMLMSIVKMVL
LPIVLGLIVHKVLGSKTEKLTDALPLVSVAAIVLIIGAVVGASKGKIMESGLL
IFAVVVLHNGIGYLLGFFAAKWTGLPYDAQKALTIEVGMQNSGLAAALAA 
AHFAAAPVVAVPGALFSVWHNISGSLLATYWAAKAGKHKKPLDRAGSEN 
LYFQ 
Expression screening, mutagenesis and protein purification. Bacterial ASBT
homologues were cloned as GFP-His8 fusions into the vector pWaldoGFPe, as
fluorescence from a C-terminal GFP fusion is a reliable reporter of
membrane-integrated expression. Fusions were overexpressed in E. coli C43(DE3)
cells by the addition of 0.4 mM IPTG at an A600 nm of 0.4. The temperature was
decreased to 25°C for overnight induction. The monodispersity of expressed
fusions was screened in crude DDM-, decyl-β-D-maltopyranoside-,
nonyl-β-D-maltopyranoside-, LDAO- or dodecyl nonaethylene glycol ether
(C12E9)-solubilized membranes by fluorescence-detection size-exclusion
chromatography (FSEC). The ASBTNM
homologue from N. meningitidis (MC58) was selected for structural studies on
the basis of the amount of protein produced, as judged by whole-cell and in-gel
fluorescence, and the quality of the FSEC trace in different detergents. Site-directed
mutants of ASBTNM were generated by PCR (Quickchange, Agilent Technologies).
Wild-type ASBTNM and mutants were purified essentially as previously
described. In brief, membranes were isolated from 10-litre E. coli cultures and
solubilized in 1% DDM for 2 h in buffer containing 1× PBS, 150 mM NaCl and
10 mM imidazole. The suspension was cleared by ultracentrifugation at 120,000g
for 1 h. The sample was mixed with 1 ml of Ni-NTA Superflow resin (Qiagen) per
1 mg of GFP-His8 and incubated for 2 h at 4°C. Slurry was loaded onto a glass
Econo-Column (Bio-Rad) and washed in 1× PBS buffer containing 0.1% DDM,
150 mM NaCl and 20 mM imidazole for 20 column volumes. Bound material was
washed for a further 20 column volumes in the same buffer containing 50 mM
imidazole. The ASBTNM-GFP-His8 fusion was eluted in two column volumes of
the same buffer containing 250 mM imidazole. The eluted protein was dialysed
overnight in the presence of stoichiometric amounts of His8-tagged tobacco etch
virus protease in 3 l of buffer containing 20 mM Tris-HCl, pH 7.5, 150 mM NaCl
and 0.03% DDM. Dialysed sample was passed through a 5-ml Ni-NTA His-Trap
column (GE Healthcare), and the flow-through containing ASBTNM was collected.
Protein was concentrated to 10 mg ml⁻¹ using concentrators with a relative
molecular mass cut-off of 100K, and was loaded onto a Superdex 200 10/300 gel
filtration column (GE Healthcare) equilibrated in 20 mM Tris-HCl, pH 7.5,
150 mM NaCl and 0.06% LDAO. The detergent LDAO was considered
suitable for crystallization on the basis of a comparison of FSEC and stability
data for ASBTNM with those of membrane proteins known to crystallize in this
detergent. The protein peak was collected and concentrated to 20 mg ml⁻¹ for
crystallization.
Transport time course. E. coli cells harbouring wild-type ASBTNM-GFP-His8
were collected and resuspended in uptake buffer consisting of 1 mM CaCl₂, 1 mM
MgCl₂, 10 mM Tris-HCl, pH 7.5, and either 137 mM NaCl (Na⁺-containing
buffer) or 137 mM choline chloride (Na⁺-low buffer). Cells were incubated at
37°C with uptake buffer containing 4 µM taurocholate supplemented with
0.16 µM [2,4-³H]-taurocholate (30 Ci mmol⁻¹; American Radiolabelled
Chemicals) for the indicated time intervals. Transport was terminated by the
addition of ice-cold buffer containing 1 mM CaCl₂, 1 mM MgCl₂, 10 mM
Tris-HCl, pH 7.5, 137 mM NaCl or 137 mM ChCl, and 1 mM taurocholate, and was
followed immediately by centrifugation at 20,500g for 60 s. Cell pellets were
washed several times in an equal volume of termination buffer and resuspended
in 200 µl of the same buffer. The radioactivity corresponding to the internalized
substrate was measured by scintillation counting. Each experiment was performed
in triplicate. Nonspecific uptake was assessed by repeating the time course in
triplicate for cells transformed with the same vector but expressing the
sodium/proton antiporter fusion NhaA-GFP-His8. In all experiments, ASBTNM
expression was calculated on the basis of GFP fluorescence measured at 510 nm
(excitation wavelength, 488 nm) using a 96-well spectrofluorometer. In-gel
fluorescence and FSEC analyses of DDM-solubilized whole cells of wild-type
ASBTNM and mutants were carried out essentially as described previously.
Transport kinetics. The accumulation of taurocholate was linear within the first
120 s. For kinetic characterization, the initial velocity of taurocholate uptake at
37°C was measured after 120 s at the indicated increasing substrate
concentrations. The radioactivity corresponding to the internalized substrate was
measured by scintillation counting. Nonspecific uptake was measured by repeating
the transport kinetics for cells transformed with the same vector, but expressing the
sodium/proton antiporter fusion NhaA-GFP-His8. Specific ASBTNM uptake was
calculated by the subtraction of nonspecific uptake from total uptake. Each
experiment was performed in triplicate. The data were fitted to the Michaelis-Menten
equation by nonlinear regression using the GRAPHPAD PRISM software.
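
The analysis described above (subtracting nonspecific uptake, then fitting the Michaelis-Menten equation by nonlinear regression) can be sketched in a few lines. This is only an illustration, not the GRAPHPAD PRISM procedure used in the study: the function names, the damped Gauss-Newton optimizer and the synthetic numbers are all assumptions made for the example.

```python
def michaelis_menten(s, vmax, km):
    """Initial uptake velocity predicted at substrate concentration s."""
    return vmax * s / (km + s)


def specific_uptake(total, nonspecific):
    """Subtract the control (nonspecific) uptake from the total uptake."""
    return [t - n for t, n in zip(total, nonspecific)]


def sum_squared_error(s_values, v_values, vmax, km):
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(s_values, v_values))


def fit_michaelis_menten(s_values, v_values, max_iterations=200):
    """Damped Gauss-Newton least-squares fit (assumed method); returns (vmax, km)."""
    vmax = max(v_values)                        # crude starting guess
    km = sorted(s_values)[len(s_values) // 2]   # median concentration
    error = sum_squared_error(s_values, v_values, vmax, km)
    for _ in range(max_iterations):
        # Accumulate the 2x2 normal equations (J^T J) d = J^T r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_values, v_values):
            d_vmax = s / (km + s)
            d_km = -vmax * s / (km + s) ** 2
            r = v - michaelis_menten(s, vmax, km)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * r
            b2 += d_km * r
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            break
        step_vmax = (b1 * a22 - b2 * a12) / det
        step_km = (a11 * b2 - a12 * b1) / det
        # Halve the step until the fit no longer gets worse.
        scale = 1.0
        while scale > 1e-10:
            new_vmax = vmax + scale * step_vmax
            new_km = km + scale * step_km
            if new_km > 0:
                new_error = sum_squared_error(s_values, v_values, new_vmax, new_km)
                if new_error <= error:
                    break
            scale /= 2.0
        else:
            break  # no improving step found
        vmax, km, error = new_vmax, new_km, new_error
        if abs(scale * step_vmax) < 1e-12 and abs(scale * step_km) < 1e-12:
            break
    return vmax, km


# Synthetic, made-up data (arbitrary units); not measurements from the paper.
concentrations = [5, 10, 25, 50, 100, 200, 400]
total = [michaelis_menten(s, 100.0, 30.0) + 2.0 for s in concentrations]
nonspecific = [2.0] * len(concentrations)
vmax_fit, km_fit = fit_michaelis_menten(
    concentrations, specific_uptake(total, nonspecific))
print(vmax_fit, km_fit)
```

With real triplicate data, the mean specific uptake at each concentration would replace the synthetic values, and a dedicated statistics package would also report confidence intervals.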
Activity of ASBTNM mutants. E. coli cells harbouring ASBTNM-GFP-His8
mutants were resuspended in uptake buffer containing 4 µM taurocholate
supplemented with 0.16 µM [2,4-³H]-taurocholate (30 Ci mmol⁻¹; American
Radiolabelled Chemicals) for 5 min at 37°C. The radioactivity corresponding to
the internalized substrate was measured by scintillation counting. For each
mutant, the uptake values were corrected for background by subtracting values
from parallel assays carried out in the absence of sodium. Activities were
plotted as percentages of the wild-type transport activity calculated in the
same way. Each experiment was performed in triplicate.
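
The background correction and normalization described above amount to a short calculation. The sketch below is illustrative only; the function name, the use of triplicate means and the example counts are assumptions, not values or code from the study.

```python
def mean(values):
    return sum(values) / len(values)


def percent_of_wild_type(mutant_na, mutant_no_na, wt_na, wt_no_na):
    """Sodium-dependent uptake of a mutant as a percentage of wild type.

    Each argument is a list of replicate readings (for example, triplicates)
    taken with sodium (``*_na``) or in its absence (``*_no_na``).
    """
    mutant_specific = mean(mutant_na) - mean(mutant_no_na)
    wt_specific = mean(wt_na) - mean(wt_no_na)
    if wt_specific <= 0:
        raise ValueError("wild-type sodium-dependent uptake must be positive")
    return 100.0 * mutant_specific / wt_specific


# Made-up scintillation counts, for illustration only.
print(percent_of_wild_type([60, 62, 58], [10, 10, 10],
                           [110, 112, 108], [10, 10, 10]))
```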

Substrate specificity. The whole-cell [³H]-taurocholate uptake assay was carried
out similarly to that described for ASBTNM mutants, except that 150 µM of either
taurocholate (Sigma), cyclosporin A (Sigma), bromosulfophthalein (Sigma) or
fluvastatin (Cayman Europe) was added to the uptake buffer.

Crystallization and preliminary screening. Crystals were grown at 20°C using 
the vapour diffusion method. Taurocholic acid (Sigma) was added to the protein 
solution to a final concentration of 10 mM. The protein was then mixed 1:1 with 
reservoir solution containing 50mM sodium citrate, pH 4.5, 70 mM NaCl and 
22-24% PEG 400. Crystals appeared overnight and reached a maximum size 
after 3-4 days. The crystals were frozen in liquid nitrogen and screened using 
synchrotron radiation at the European Synchrotron Radiation Facility and 
Diamond Light Source. Crystals are tetragonal with cell dimensions of
approximately 75 Å × 75 Å × 180 Å. The best of these crystals diffract to around
2.8-3.5 Å; however, with dehydration the diffraction increases to ~2 Å.

Structure determination of a cysteine mutant of ASBTNM. As initial attempts at
making heavy-atom derivatives with mercury compounds failed, Leu 87 was
modified to cysteine (construct ASBTNM_1). The ASBTNM_1 protein crystallized
similarly to the wild-type protein. Mercury-derivatized crystals were obtained
from this mutant by incubating for 1 h with 1 mM mercury acetate before
crystallization. A single mercury-derivatized crystal of ASBTNM_1 was used to
solve the structure by single-wavelength anomalous dispersion. The crystal was
frozen in liquid nitrogen and then re-annealed before data collection by leaving
it in air for approximately 3 min. The re-annealing resulted in shrinkage of the
unit cell and an increase in the resolution to 2.2 Å. Data were collected at the
mercury edge (1.0060 Å) on beamline I03 at the Diamond Light Source.

Data were initially processed to 2.5 Å by the XIA2 pipeline to XDS set up on
the beamline, with further processing using the CCP4 suite of programs. The
space group was determined to be P4,22, with one molecule in the asymmetric 
unit. An anomalous difference Patterson map showed clear peaks associated with 
one bound heavy atom. The heavy-atom coordinates were determined using 
RSPS. Its position was refined and phases were calculated using SHARP with
solvent flattening in SOLOMON. The resulting phases were input to the
automatic structure building implemented in PHENIX. This resulted in a model that
was reasonably complete. Modification and further building of the structure were
carried out in O and COOT. At this point, the data were reprocessed using
MOSFLM, extending the resolution to 2.2 Å as judged from the scaling statistics
(Supplementary Table 1) and the features in the resulting maps.

Structural refinement was performed in BUSTER using individual isotropic
B-factor refinement and TLS. The complete protein was chosen as a single TLS
group because no significant drop in the Rfree value was observed when splitting
the protein into multiple groups.

Two ions were identified in the core of the protein. The residues coordinating
these ions and the associated distances are consistent with their being sodium
ions. As an additional verification, the putative sodium ions were changed to
water molecules and run through the program WASP, which uses valence
calculations to identify possible metal ions. After replacing all solvent and ions by
water molecules as required by the program, only the two solvent molecules
originally assigned as sodium ions were flagged as likely sodium ions. After all
residues had been modelled, clear electron density remained in the cavity of the
protein. This density was enhanced in a simulated-annealing omit map calculated
in PHENIX. The taurocholate structure, downloaded from the Cambridge
Structural Database (accession code, KORZUM), clearly fitted the density with
the cholate head group positioned at the bottom of the cavity (Supplementary
Fig. 9b). A further taurocholate was observed in the crystal interface. The final
model has an R-factor of 19.7% and a corresponding Rfree value of 22.9%, and
contains all protein residues from 2 to 309, two sodium ions, one mercury atom,
two taurocholate molecules, 37 water molecules, five LDAO molecules and two
truncated phospholipids (phosphatidylethanolamine). The final refinement statistics
of this model, which was used to solve the wild-type protein, are summarized in
Supplementary Table 2.

Structure determination and refinement of ASBTNM. Because the re-annealing
of the ASBTNM_1 construct in air was not reproducible, dehydration was
attempted using the humidity controller HC1 device mounted on beamline
I02 at the Diamond Light Source. By placing the crystal into an air stream at
45% relative humidity for 5 min before freezing it, crystals were found to
reproducibly diffract to ~2.0 Å. Data were collected from a single crystal of
ASBTNM on beamline I02 at the Diamond Light Source. The data were processed in
XDS using the XIA2 pipeline and scaled at a resolution of 2.2 Å (Supplementary
Table 1). The structure was refined, as above, starting from the final model of
the ASBTNM_1 construct, minus all non-protein residues. No appreciable
differences were observed between the wild-type and mercury-derivatized
structures. Taurocholate and detergent molecules were modelled in the same
positions as for ASBTNM_1. The final model has an R-factor of 21.2% and an Rfree
value of 24.4% (Supplementary Table 2).
Structural analysis. Superpositions were carried out in LSQMAN. The
superpositions were performed so that only Cα pairs that were less than 3.8 Å
apart were included in the calculation. The numbers quoted in the text regarding the
topology-inverted repeats of ASBTNM were calculated between pairs of Cα atoms
that were less than 10 Å apart. This was considered necessary to include atoms
from both the V-motif and the core motif. In comparing ASBTNM with NhaA
(Protein Data Bank ID, 1ZCD), only pairs of atoms less than 5 Å apart after
superposition were chosen, giving an r.m.s.d. of 2.9 Å for 202 out of a possible
308 pairs of Cα atoms. The volume of the cavity was calculated in VOIDOO using
a probe radius of 1.4 Å. Figures showing the structure were drawn using PYMOL
except those showing electron density, which were made using CCP4MG.
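
To make the cutoff-based r.m.s.d. concrete, the sketch below computes the r.m.s.d. over only those Cα pairs that lie within a distance cutoff. It is a simplified illustration and not the LSQMAN procedure: it assumes the two structures are already superposed and that equivalent atoms are already paired, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd_within_cutoff(coords_a, coords_b, cutoff):
    """Return (rmsd, n_pairs) over paired atoms closer than ``cutoff`` (in Å).

    ``coords_a`` and ``coords_b`` are equal-length lists of (x, y, z) tuples
    for equivalent atoms in two structures that are already superposed.
    """
    squared = []
    for a, b in zip(coords_a, coords_b):
        d2 = sum((p - q) ** 2 for p, q in zip(a, b))
        if d2 < cutoff ** 2:          # keep only pairs inside the cutoff
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs within the cutoff")
    return math.sqrt(sum(squared) / len(squared)), len(squared)


# Toy coordinates: pair distances are 3 Å, 4 Å and 20 Å.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 4.0), (2.0, 0.0, 20.0)]
print(rmsd_within_cutoff(structure_a, structure_b, cutoff=5.0))
```

A larger cutoff admits more atom pairs at the cost of a higher r.m.s.d., which is why the number of pairs is reported together with the r.m.s.d.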
Outward-facing model. As in LeuT, ASBTNM is made up of two
five-transmembrane-helix repeats that, when superimposed, show a small rotation
of two transmembrane helices with respect to the other three (Supplementary
Fig. 4). For LeuT, it was shown that by swapping the conformations of the
N- and C-terminal topology-inverted repeats the structure can be changed from
an outward-facing state to an inward-facing state. In ASBTNM, the lengths of the
two topology-inverted repeats are very similar. To create an outward-facing
backbone model of ASBTNM, in an analogous manner to that carried out for LeuT,
TM1 to TM5 were superposed on TM6 to TM10, and vice versa.
Two distinct microbial processes, denitrification and anaerobic
ammonium oxidation (anammox), are responsible for the release
of fixed nitrogen as dinitrogen gas (N₂) to the atmosphere.
Denitrification has been studied for over 100 years and its
intermediates and enzymes are well known. Even though anammox is a
key biogeochemical process of equal importance, its molecular
mechanism is unknown, but it was proposed to proceed through
hydrazine (N₂H₄). Here we show that N₂H₄ is produced from the
anammox substrates ammonium and nitrite and that nitric oxide
(NO) is the direct precursor of N₂H₄. We resolved the genes and
proteins central to anammox metabolism and purified the key
enzymes that catalyse N₂H₄ synthesis and its oxidation to N₂.
These results present a new biochemical reaction forging an N-N
bond and fill a lacuna in our understanding of the biochemical
synthesis of the N₂ in the atmosphere. Furthermore, they reinforce
the role of nitric oxide in the evolution of the nitrogen cycle.

Ammonium is difficult to activate in the absence of molecular oxygen.
Therefore, how anammox bacteria are able to oxidize ammonium
coupled to the reduction of nitrite and forge an N-N bond to make
N₂ has been an intriguing question for a long time. Based on the in silico
analysis of the genome assembly of the anammox bacterium Kuenenia
stuttgartiensis, a set of three redox reactions (equations (1)-(3))
involving N₂H₄ and nitric oxide (NO) was proposed to explain the overall
anammox stoichiometry (equation (4)):


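$$\mathrm{NO_2^- + 2H^+ + e^- = NO + H_2O} \quad (E_0' = +0.38\ \mathrm{V}) \tag{1}$$

$$\mathrm{NO + NH_4^+ + 2H^+ + 3e^- = N_2H_4 + H_2O} \quad (E_0' = +0.06\ \mathrm{V}) \tag{2}$$

$$\mathrm{N_2H_4 = N_2 + 4H^+ + 4e^-} \quad (E_0' = -0.75\ \mathrm{V}) \tag{3}$$

$$\mathrm{NH_4^+ + NO_2^- = N_2 + 2H_2O} \quad (\Delta G^{\circ\prime} = -357\ \mathrm{kJ\ mol^{-1}}) \tag{4}$$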
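
As a quick consistency check, the three half-reactions (1)-(3) can be summed to confirm that protons, electrons and the intermediates NO and N₂H₄ cancel, leaving the overall stoichiometry of equation (4). The sketch below is illustrative only; the dictionary representation and the helper name `net_reaction` are assumptions made for this example.

```python
from collections import Counter

# Signed stoichiometric coefficients: reactants negative, products positive.
EQ1 = {"NO2-": -1, "H+": -2, "e-": -1, "NO": 1, "H2O": 1}
EQ2 = {"NO": -1, "NH4+": -1, "H+": -2, "e-": -3, "N2H4": 1, "H2O": 1}
EQ3 = {"N2H4": -1, "N2": 1, "H+": 4, "e-": 4}
EQ4 = {"NH4+": -1, "NO2-": -1, "N2": 1, "H2O": 2}


def net_reaction(*reactions):
    """Sum reactions (hypothetical helper) and drop species that cancel."""
    total = Counter()
    for reaction in reactions:
        for species, coefficient in reaction.items():
            total[species] += coefficient
    return {species: c for species, c in total.items() if c != 0}


overall = net_reaction(EQ1, EQ2, EQ3)
print(overall == EQ4)  # prints True
```

The one electron consumed in equation (1) and the three consumed in equation (2) are exactly matched by the four released in equation (3), so no external electron donor or acceptor appears in the net reaction.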


The role of N₂H₄ in anammox catabolism was originally proposed based
on the observation that the compound transiently accumulated when
anammox bacteria were incubated with millimolar quantities of
hydroxylamine. However, the turnover of neither N₂H₄, hydroxylamine nor
NO was demonstrated to start from the actual substrates ammonium
and nitrite; thus it remained unclear whether the observed reaction was
an integral part of the anammox pathway or a side reaction.

In the present study, we resolved the anammox pathway and its
enzymes by a combination of complementary approaches (Fig. 1).
K. stuttgartiensis was enriched and grown as suspended cells in a
membrane bioreactor. Fluorescence in situ hybridization (FISH)
showed that K. stuttgartiensis made up more than 95% of the
population. Transcription was shown for more than 97% of all genes after
random hexamer-primed reverse transcription of extracted RNA,
sequencing and mapping of 5.6 million 32-nucleotide reads on an
Illumina Genome Analyser (metatranscriptome accession number
GSE15408). Expression of 1010 proteins was demonstrated by
metaproteomics (peptidome accession number PSE111). Further,
inhibitor and isotope labelling studies were performed and the activity
of enzyme complexes was demonstrated after their purification by
liquid chromatography.

Transcriptomics and proteomics indicated that K. stuttgartiensis
expressed cd₁ nitrite::nitric oxide reductase (NirS, kuste4136, 9% of
predicted peptides detected (p.p.d.) and 6.3-fold messenger RNA
(mRNA) coverage) with the potential ability to reduce nitrite to NO.
This possibility was investigated by incubating cell suspensions of
K. stuttgartiensis with ammonium, nitrite (2 mM each) and 100 µM of the
NO scavenger PTIO (2-phenyl-4,4,5,5-tetramethylimidazoline-1-oxyl-3-oxide).
When PTIO was introduced at the start of the
incubation or when it was added to active cells, anammox activity
was inhibited (Fig. 2a). Further, the cells were incubated with
ammonium and nitrite (2 mM each) in the presence of DAF2-DA (10 µM),
which reacts with NO to form a fluorescent product. Sampled K.
stuttgartiensis cells displayed the characteristic green fluorescence
indicating NO production (Fig. 2b and Supplementary Fig. 1). In
control experiments without nitrite or with added PTIO, there was
no detectable fluorescent signal. It should be noted that both PTIO and
DAF2-DA might have a wider reaction spectrum than NO and might
Figure 1 | Biochemical pathway and enzymatic machinery of K.
stuttgartiensis. The anammoxosome, an intracytoplasmic compartment 
bounded by a membrane (grey line), is the locus of anammox catabolism. 
Identifiers of open reading frames and the degree to which the encoded 
respiratory protein complexes were detected in the proteome are indicated. 
Hydrazine synthase depicted in the centre of the figure is also loosely 
membrane associated. Yellow arrows, electron flow; yellow square, iron- 
sulphur clusters; b, haem b; c, haem c; c!, atypical haem c; d, haem d; Mo, 
molybdopterin. Cofactors and motifs were determined previously.
Figure 2 | Determination of nitric oxide (NO) as an intermediate. NO₂⁻
and NH₄⁺ (2 mM each) conversion was inhibited by 100 µM PTIO (a). PTIO
added at t = 0 (open triangle), PTIO added at 40 min (open square) and
without PTIO (open circle), n = 2 (error bars, s.d.). (b) Epifluorescence image
of the DAF2-DA (diaminofluorescein-2-diacetate) derivative of NO formed during
NH₄⁺ and NO₂⁻ (2 mM each) conversion by anammox bacteria (scale bar,
10 µm).


possibly react with other nitrogen monoxides such as nitroxyl (HNO). 
However, unlike NO, HNO was not a suitable substrate for hydrazine 
synthase (see below). 

Interestingly, when acetylene (15 µM) was added, the anammox
reaction was inhibited. Acetylene inhibits aerobic ammonium
oxidation by binding covalently to the ammonia monooxygenase, the
ammonia-activating enzyme of aerobic ammonium oxidizers.
Apparently, it also interfered with the ammonium-activating step of
anammox cells (equation (2)). Importantly, acetylene inhibition
resulted in an immediate accumulation of NO; hydroxylamine
accumulation was not observed, consistent with the role of NO as the direct
precursor for N₂H₄.

The second step of the predicted anammox pathway would then be
the reduction of NO and its simultaneous condensation with
ammonium to produce N₂H₄ (equation (2)). Because the role of N₂H₄ in
anammox catabolism was not established, we first demonstrated its in
vivo turnover (Fig. 3a, b). To investigate whether N₂H₄ could be
produced directly from NO, cell suspensions were incubated with NO
(0.1 mM) and ammonium (2 mM). A transient accumulation of
hydrazine (18 µM) was observed (Fig. 3c), albeit at a much lower
concentration than the 200-500 µM observed for incubations with hydroxylamine.
This is consistent with equations (1)-(3) because the major part of the
produced N₂H₄ would be oxidized to N₂ as expected from the overall
reaction and NO could be supplied at much lower concentrations
(equation (5)).


$$\mathrm{NO + NH_4^+ = \tfrac{1}{4}N_2H_4 + H_2O + \tfrac{3}{4}N_2 + H^+} \tag{5}$$
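
One way to see where the fractional coefficients in equation (5) come from is to combine equation (2) with three-quarters of equation (3), so that the electrons released by hydrazine oxidation exactly cover those needed for hydrazine synthesis. This decomposition is offered as an illustrative check rather than as the authors' derivation; the helper name `combine` and the data layout are assumptions.

```python
from fractions import Fraction

# Signed stoichiometric coefficients: reactants negative, products positive.
EQ2 = {"NO": -1, "NH4+": -1, "H+": -2, "e-": -3, "N2H4": 1, "H2O": 1}
EQ3 = {"N2H4": -1, "N2": 1, "H+": 4, "e-": 4}
EQ5 = {"NO": -1, "NH4+": -1, "N2H4": Fraction(1, 4), "H2O": 1,
       "N2": Fraction(3, 4), "H+": 1}


def combine(weighted_reactions):
    """Weighted sum of reactions (hypothetical helper); cancelled species dropped."""
    total = {}
    for weight, reaction in weighted_reactions:
        for species, coefficient in reaction.items():
            total[species] = total.get(species, 0) + weight * coefficient
    return {species: c for species, c in total.items() if c != 0}


result = combine([(1, EQ2), (Fraction(3, 4), EQ3)])
print(result == EQ5)  # prints True
```

Three of every four hydrazine molecules formed must be oxidized to supply the three electrons that each round of synthesis consumes, which leaves one-quarter of a hydrazine per NO as the net product.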


The anammox pathway is completed by the oxidation of N₂H₄ to N₂
(equation (3)). For a long time, N₂H₄ was known as an alternative
Figure 3 | Hydrazine turnover. K. stuttgartiensis cells were incubated with
2 mM ¹⁵NO₂⁻ and ¹⁴NH₄⁺ each in the presence of 2 mM ²⁸N₂H₄. Under these
conditions cells would only produce ²⁹N₂H₄ and preferentially consume
²⁸N₂H₄, leading to ¹⁵N-label accumulation in the N₂H₄ pool. The 295 and 296
m/z masses correspond to derivatization products of ²⁸N₂H₄ and ²⁹N₂H₄ with
para-dimethylaminobenzaldehyde (a). The 294 m/z mass arises from the
impurities of the matrix. Within 15 min, 16% of the N₂H₄ pool was labelled
(b). Hydrazine (open circles) was produced by the cells incubated with 2 mM
NH₄⁺ (open triangles) and NO (0.1 mM) (open squares), n = 2 (error bars,
s.d.) (c).


substrate for octahaem hydroxylamine oxidoreductases (HAOs), the 
enzymes that catalyse the conversion of hydroxylamine to nitrite in 
aerobic ammonium oxidizers. Strikingly, the K. stuttgartiensis
genome encoded ten divergent paralogues of this enzyme, and six were 
detected at high levels in the transcriptome and proteome (mRNA up 
to 189-fold coverage, 27-58% p.p.d.; Supplementary Table 1). Six 
expressed paralogues belonged to the ‘type IP hydrazine/hydroxyla- 
mine oxidoreductases (HZO/HAO)*”. Two related ‘type I? HZO/ 
HAO and one divergent octahaem cytochrome c were also detected
at lower levels (4-15% p.p.d.) and one was not detected. By a two-step
liquid chromatography procedure, we purified two highly expressed
HZO/HAO-like proteins (kustc0694 and kustc1061). These enzymes
appear to be closely related to two enzymes of unknown function
isolated from an anammox enrichment culture KSU-1 (refs 21, 22).
Both enzymes catalysed the four-electron oxidation of N₂H₄ to N₂ with
cytochrome c as the artificial electron acceptor at different rates (2.5
and 0.4 µmol min⁻¹ mg protein⁻¹, respectively). When they were
incubated with ³⁰N₂H₄ and cytochrome c, ³⁰N₂ (¹⁵N¹⁵N) was produced
stoichiometrically, in agreement with equation (3). Interestingly,
kustc1061 also oxidized hydroxylamine to NO (rather than nitrite)
with a higher rate (6 µmol min⁻¹ mg protein⁻¹). In contrast, kustc0694
did not catalyse this reaction and hydroxylamine and NO were
powerful inhibitors of N₂H₄ oxidation, suggesting kustc0694 was the
dedicated hydrazine dehydrogenase (HDH) in K. stuttgartiensis.
Furthermore, the inhibition of kustc0694 explained the transient
accumulation of N₂H₄ in the presence of hydroxylamine or NO.
Although no enzyme is known to convert NO and ammonium into
N₂H₄, two candidate gene clusters were previously identified as
potentially encoding an enzyme complex with this function (hydrazine
synthase, HZS). One of these clusters (kuste2859-61) encoded the
most highly expressed proteins in the proteome (greater than 60%
p.p.d., visible as three dominant spots on two-dimensional gels;
Supplementary Table 1 and Supplementary Fig. 2a) and extremely 
abundant mRNAs in the transcriptome (greater than 50-fold 
coverage). The transcription of the other candidate cluster (kuste2474- 
83) was well below average (1.7-fold coverage) and expression was not 
detected by proteomics. 

The kuste2859-61 proteins were purified from the cell-free extract
of K. stuttgartiensis as a complex that separated into three distinct
bands on a denaturing polyacrylamide gel, corresponding to
polypeptides encoded by three consecutive genes (kuste2859-2860-2861,
Supplementary Fig. 2). Native polyacrylamide gel electrophoresis revealed
that the complex was a multimer of approximately 240 kDa. Hydrazine
synthesis activity of the complex was shown in a coupled assay with the
kustc1061 HZO/HAO, using ¹⁵N-ammonium (1 mM) and NO
(0.9 mM) as substrates (Fig. 4). In the assay, kustc1061 would ‘pull’
the reaction by rapidly oxidizing the produced N₂H₄ to ²⁹N₂ as the end
product, while simultaneously ‘pushing’ the reaction by providing the
electrons for N₂H₄ synthesis (equations (2) and (3)). Kustc1061 alone
did not catalyse the reaction, and N₂ production above background
could not be measured in the absence of ammonium or NO. N₂ was
not produced above background when hydroxylamine or nitroxyl
(HNO) was provided as substrate with ammonium. The activity of
N₂ formation in the coupled assay was 20 nmol h⁻¹ mg protein⁻¹,
lower than the activity of whole cells with ammonium and nitrite
(approximately 1800 nmol h⁻¹ mg protein⁻¹). The cell-free extracts
were unable to form N₂ from ammonium and nitrite, but could from
NO and ammonium under the same experimental conditions, at a
six-fold lower rate than the purified HZS (3.4 nmol h⁻¹ mg protein⁻¹).
The decrease in activity upon mere cell disruption was most probably
due to the disruption of a tightly coupled multi-component system
with hydrazine synthesis as the rate-limiting step.
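
The fold differences implied by the rates quoted above can be checked with simple arithmetic. The sketch below only restates the three numbers from the text; the dictionary keys and the function name are labels chosen for this example.

```python
# Specific rates of N2 formation quoted in the text (nmol per h per mg protein).
RATES = {
    "whole_cells": 1800.0,       # approximate; ammonium + nitrite
    "coupled_assay": 20.0,       # purified HZS complex + kustc1061
    "cell_free_extract": 3.4,    # NO + ammonium
}


def fold_difference(higher, lower):
    """Ratio of two of the quoted rates (hypothetical helper)."""
    return RATES[higher] / RATES[lower]


print(round(fold_difference("coupled_assay", "cell_free_extract"), 1))  # prints 5.9
print(round(fold_difference("whole_cells", "coupled_assay")))           # prints 90
```

The first ratio corresponds to the roughly six-fold difference stated in the text; the second shows that intact cells are about 90 times faster than the reconstituted coupled assay.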

Interestingly, the kuste2859-61 complex was capable of N₂
formation from ammonium and NO on its own (Fig. 4). The purified enzyme
oxidized N₂H₄ to N₂ with a specific activity of 34 nmol min⁻¹ mg
protein⁻¹, resulting in an overall disproportionation reaction
(equation (5)). Considering that N₂H₄ is the energy source in anammox
Figure 4 | ²⁹N₂ production by hydrazine synthase complex and kustc1061
from ¹⁵NH₄⁺ and NO. ²⁹N₂ was produced with the highest rate when
hydrazine synthase complex (1.6 mg) and kustc1061 (4.7 µg) were incubated
with ¹⁵NH₄⁺ (1 mM), ¹⁴NO (0.9 mM) and cytochrome c (50 µM) (filled
circles). In the control experiments, hydrazine synthase complex and
cytochrome c (open circles), kustc1061 and cytochrome c (open diamonds),
cytochrome c (filled squares) and only buffer (open triangles) were incubated
under the same experimental conditions; n = 3 (error bars, s.d.).
metabolism, N₂ formation by HZS would be unproductive.
Consequently, we may speculate that the anammox bacterium harbours
backup systems that efficiently trap hydrazine and that keep
(nitrogenous) inhibitory compounds, like NO and hydroxylamine, at low
concentrations, which would partly explain the redundancy of
HAO/HZO-like proteins in the organism. Our experiments showed that HZS
and HDH were necessary and sufficient to make N₂ from the substrate
ammonium and the intermediate NO.

Taken together, anammox catabolism and energy for growth must 
be conserved from three reactions (equations (1)-(3)). It is hypothe- 
sized that anammox bacteria synthesize ATP through a membrane- 
bound ATP synthase complex driven by proton-motive force (pmf) 
generated through catabolic reactions with the intermediary action of 
the quinol::cytochrome c oxidoreductase system (complex III, the bc₁
complex).

Intriguingly, three gene clusters encoding bc₁ complexes and four
encoding ATP synthases were present in the K. stuttgartiensis genome.
Transcription and expression of one (kuste4569-74) of these gene
clusters were detected at higher levels (26-33% p.p.d., 6- to 24-fold
mRNA coverage) than the other two (0-19% p.p.d., 2- to 15-fold
mRNA coverage). When K. stuttgartiensis cell suspensions were spiked
with pentachlorophenol (10 µM), a structural analogue of quinol and a
known inhibitor of the bc₁ complex, anammox activity was completely
inhibited, indicating that the bc₁ complex was involved in energy
conservation and its role in electron transport from N₂H₄ oxidation
to nitrite reduction and hydrazine synthesis was not backed up by any
other system. The expression of the four gene clusters encoding ATP
synthase was even more skewed. Peptide coverage for kuste3789-96
was 14-58% p.p.d. compared with less than 1% for the other three ATP
synthases, and mRNA coverage differed by a factor of six. The gene
product encoding the catalytic β-subunit of the highest expressed ATP
synthase (kuste3787-96) was recently shown to be associated with the
membranes of the intracellular cell compartment, the
anammoxosome, suggesting it to be the site where the proton-motive machinery
resides.

In the present study we experimentally identified NO and N₂H₄ as
the intermediates of anaerobic ammonium oxidation. The highly
expressed protein encoded by the gene cluster kuste2859-61 was
purified and N-N bond formation from NO and ammonium was
demonstrated. Hydrazine synthase and the NO reductase of denitrifiers are
the two enzymes capable of bonding two N atoms together. In contrast
to NO reductase, hydrazine synthase combines two different
nitrogenous molecules. It is intriguing that all the N₂ in our atmosphere is
formed by the oxidizing power of NO, in line with the hypothesis that
NO may have been the first deep redox sink on Earth.


METHODS SUMMARY 

Activity measurements. Physiological experiments were performed at 33 °C, pH
7.5 with K. stuttgartiensis cells. To determine the role of NO and hydroxylamine
in anammox metabolism, cells were incubated with (1) NaNO₂ and NH₄Cl (2 mM
each), spiked with acetylene (15 µM); (2) NaNO₂ and NH₄Cl (2 mM each) with
DAF-2DA (10 µM) or PTIO (100 µM); or (3) NO (0.1 mM) and 2 mM NH₄Cl.
Hydroxylamine, NH₄⁺, NO₂⁻ and N₂H₄ were determined as previously
described. NO was measured online as previously described. To determine
N₂H₄ turnover, cells were incubated with Na¹⁵NO₂, NH₄Cl (or vice versa) and
N₂H₄ (2 mM each). Isotopic composition of hydrazine was determined with
matrix-assisted laser desorption/ionization-time of flight mass spectrometry
(MALDI-TOF MS) after para-dimethylaminobenzaldehyde derivatization,
using a method adapted from Watt and Chrisp. All labelled compounds were 99% pure
(Cambridge Isotope Laboratories).

Proteins were purified from cell-free extracts with anion exchange and
hydroxyapatite liquid chromatography. Activity measurements were performed at
37 °C, pH 7 in an anaerobic chamber. Kuste2859-61 (1.6 mg) and kustc1061
(4.7 µg) were incubated with NO (0.9 mM) and ¹⁵NH₄Cl (1 mM), and ²⁹N₂
production was monitored by gas chromatography (Agilent 6890 with a Porapak Q
column, 80 °C) combined with a mass spectrometer (Agilent 5975c quadrupole
inert MS). For rate calculations, kustc0694 (1.3 µg) or kustc1061 (4.7 µg) was
incubated with N₂H₄ or hydroxylamine and cytochrome c (50 µM each), and ³⁰N₂
or ³¹NO production was measured.
Molecular methods. Total RNA was extracted, reverse transcribed, sequenced
with Illumina technology and mapped to the genome sequence of K. stuttgartiensis. From the
aligned reads, per-position coverage was calculated for each contig and used to
calculate the coverage for each orf, intergenic region and predicted RNA element.
Cell-free extracts were separated by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) or two-dimensional gel electrophoresis, digested with trypsin and
analysed with liquid chromatography-mass spectrometry (LC-MS/MS). Mass
spectrometry data were searched against a database of predicted K. stuttgartiensis
peptide sequences.


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.
METHODS


Source of the biomass. K. stuttgartiensis cells were collected from a 10-l
laboratory-scale anammox membrane bioreactor and were concentrated by
centrifugation. The cells were re-suspended to a protein concentration higher than
1 mg ml⁻¹. Part of the cell suspension was diluted 100 times, chemically fixed, and
hybridizations with fluorescently labelled oligonucleotide probes were performed
as described previously.

Sample preparation. The cell suspensions were transferred to 8-ml serum bottles.
The vials were made anoxic by alternately applying under-pressure and He or Ar
seven times and were transferred to an anaerobic chamber with a 95%/5% Ar/H₂
atmosphere. O₂ in the Ar in the anaerobic chamber was removed by passing Ar
over a Pd catalyst (0.2 p.p.m. residual O₂). In the anaerobic chamber, cell
suspensions were diluted five times with anaerobic mineral medium (pH 7.5) to a final
volume of 40 or 8 ml and transferred to glass vials unless stated otherwise. All
preparations (for example, addition of substrates and/or inhibitors) for different
incubations were handled in the anaerobic chamber. All experiments were
performed at least in duplicate. All non-labelled salts were purchased as molecular
grade (more than 99.95% pure, Merck) unless stated otherwise. All labelled
compounds were 99% pure and purchased as sodium or chloride salts (Cambridge
Isotope Laboratories). All gaseous compounds were of the highest purity available.
Analytical methods. NO₂⁻ and NH₄⁺ were determined as described previously.
N₂H₄ was determined colourimetrically at 420 nm after reaction of 100-µl sample
with 900 µl 2% (w/v) para-dimethylaminobenzaldehyde (PDB), 3.7% (v/v) HCl in
ethanol. NH₂OH (detection limit 5 µM) was determined as previously described.
Effect of PTIO. To determine the effect of PTIO, an NO scavenger, on anammox
bacteria, three incubations were performed in parallel. NO₂⁻ and NH₄⁺ (2 mM)
were added to all incubations. To the first incubation, PTIO (100 µM) was added at
0 min, to the second it was added at 40 min, and no PTIO was added to the third
incubation. Liquid samples were taken every 15 min and analysed for NH₄⁺,
NO₂⁻ and NH₂OH as previously described.

Bioimaging of nitric oxide. To detect NO turnover, K. stuttgartiensis cell
suspensions were incubated with 2 mM NO₂⁻ and NH₄⁺ in amber vials. In parallel,
nitrite-depleted cell suspensions were incubated in the presence of 2 mM NH₄⁺.
After a 5-min pre-incubation, diaminofluorescein-2-diacetate (DAF2-DA,
Calbiochem) was added to a final concentration of 10 µM. The vials were
incubated in the dark for 30 min at 33 °C and were shaken continuously at 300 r.p.m.
As a negative control, cells were incubated with PTIO and DAF2-DA. Cells were
then harvested by centrifugation, washed three times in mineral medium to
remove the excess chromophore and were re-suspended in mineral medium. A
liquid sample (5 µl) of the suspension was pipetted on a microscope slide and dried
in the dark. The preparations were examined with a Zeiss Axioplan2
epifluorescence microscope.

Batch experiments. To determine the activity of K. stuttgartiensis with NO and
NH₄⁺, cell suspensions were incubated with NO (0.1 mM) and 2 mM NH₄⁺ in
100-ml glass vials with 10% NO (in He) in the headspace. Gas samples were
analysed in a chemiluminescence NOx analyser (CLD 700EL, EcoPhysics,
detection limit 0.1 p.p.m. NO). Liquid samples were taken every 30 min and analysed
for NH₄⁺ and N₂H₄ as previously described.

To determine the effect of acetylene on anammox bacteria, K. stuttgartiensis
suspensions were transferred to 40-ml glass vials. NO₂⁻ and NH₄⁺ were added to
the incubations to a final concentration of 2 mM. The vials were incubated at 33 °C
and were mixed with a magnetic stirrer at 500 r.p.m. and continuously flushed
with Ar/CO₂ (95%/5%) with a flow of 10 ml min⁻¹. The effluent gas from the vials
was connected to a chemiluminescence NOx analyser (CLD 700EL, EcoPhysics,
detection limit 0.1 p.p.m. NO) for online NO measurement. At 15 min, 100 µl
acetylene (15 µM) was added to the vials. As negative controls, 100 µl air and
100 µl nitrogen were added to separate incubations. Liquid samples were taken
from the incubations every 10-15 min and chilled to 0 °C immediately. The
supernatant of each sample was transferred to an Eppendorf cup and kept at 4 °C until
it was analysed for NH₄⁺, N₂H₄ and NH₂OH.
Detection of hydrazine turnover in anammox cells. To detect N₂H₄ turnover,
¹⁵NO₂⁻, NH₄⁺ (or vice versa) and N₂H₄ (2 mM each) were added to the K.
stuttgartiensis cell suspensions. The vials were incubated in the dark for 15 min
at 30 °C, 300 r.p.m. Liquid samples were taken every 5 min, and isotopic
composition of N₂H₄ was determined by MALDI-TOF MS after reaction with PDB. For
MALDI-TOF analysis, 10 µl of PDB-reacted samples were mixed with an equal
volume of sample buffer containing 20 mg ml⁻¹ α-cyano-4-hydroxycinnamic acid
in 0.05% (v/v) trifluoroacetic acid (TFA), 50% (v/v) acetonitrile. The mixtures
(0.3 µl) were spotted on a S26/100 M-probe (Bruker 15165), which was inserted
into a multiprobe adaptor. MALDI-TOF MS measurements were performed in
the mass range of 100-800 Da on a Bruker III mass spectrometer, using the
reflectron mode.
Cytochrome bc₁ complex. To determine the role of the cytochrome bc₁ complex in
anammox catabolism, K. stuttgartiensis suspensions were incubated with
pentachlorophenol, a specific inhibitor of the bc₁ complex. NO₂⁻ and NH₄⁺ were added
to the incubations to a final concentration of 2 mM. Pentachlorophenol was added
to a final concentration of 10 µM. NO₂⁻ and NH₄⁺ were determined as described
previously.

Preparation of cell free extract. K. stuttgartiensis cells (2 l, OD₆₀₀ 1.2) were
harvested from the membrane bioreactor. After centrifugation (4,000g, 4 °C), the pellet
was re-suspended in one volume 20 mM potassium phosphate buffer, pH 8. Cell
suspensions were passed three times through a French pressure cell operated at
138 MPa. The lysate was incubated with 1% (w/v) sodium deoxycholate at 4 °C for
1 h to solubilize membrane-associated proteins. After centrifugation for 15 min at
1,700g at 4 °C, the cell-free fraction was obtained as clarified supernatant.
Protein electrophoresis and MALDI-TOF analysis. Samples were denatured by
incubation with 60 mM Tris-HCl buffer (pH 8) containing 5% β-mercaptoethanol,
2% SDS (sodium dodecyl sulphate) and 25% glycerol for 5 min at 100 °C.
SDS-PAGE was performed in 10% or 6% slab gels in 375 mM Tris-HCl glycine buffer,
pH 8.8 according to Laemmli. Native PAGE (6%) was performed according to the
same procedure with the following modifications: the protein preparations were
not boiled before electrophoresis, SDS and β-mercaptoethanol were omitted from
the gels, and Tris-HCl glycine (375 mM, pH 8.3) was used as the running buffer.
Gels were stained with colloidal Coomassie blue as described elsewhere. To
identify the protein bands resolved in SDS-PAGE, gel spots (~3mm*) were 
picked, digested with trypsin and analysed with MALDI-TOF mass spectrometry 
as described elsewhere**. 

Purification of kuste2859-2860-2861, kustc0694 and kustc1061. Cell-free
extract was centrifuged at 140,000g, 10 °C (Discovery 10, Sorvall, equipped with
a T-1270 rotor) to remove the membranes. The supernatant was loaded on a 30 ml
Q Sepharose XL (GE Healthcare) column equilibrated with 20 mM Tris-HCl, pH
8. Kuste2859-2860-2861 and kustc1061 were eluted isocratically with 200 mM
NaCl in 20 mM Tris-HCl, pH 8 (2 ml min⁻¹). Kustc0694 was eluted isocratically
with 400 mM NaCl in 20 mM Tris-HCl, pH 8 (2 ml min⁻¹). Eluted fractions were
subsequently loaded onto a 10 ml Hydroxyapatite (Bio-Rad) column equilibrated
with 20 mM potassium phosphate buffer, pH 7 and eluted with a gradient of the
same buffer (20-500 mM, 2 ml min⁻¹). Kustc1061 and kuste2859-2860-2861
were collected in fractions eluted at 100 mM and 200 mM phosphate, respectively.
The pooled fractions were desalted and concentrated using Vivaspin tubes
(100 kDa cut-off, Sartorius Stedim Biotech) to concentrations of at least 0.86 mg
ml⁻¹ (kuste2859-2860-2861) and 2.3 mg ml⁻¹ (kustc1061) in 20 mM phosphate
buffer, pH 7.

Detection of hydrazine and hydroxylamine oxidation by kustc1061 and
kustc0694. Kustc1061 (4.7 µg) or Kustc0694 (1.3 µg) and cytochrome c (50 µM
final concentration, bovine heart, Sigma-Aldrich) were added to phosphate
buffer (20 mM, pH 7) in a 3-ml exetainer (Labco) to a final volume of 2 ml. The
reaction was started by adding ¹⁵N-labelled N₂H₄ from an anoxic stock: 10 µM
to determine the electron stoichiometry and 50 µM for routine rate assays. To determine
the capacity for NH₂OH oxidation, proteins were incubated in separate vials with
50 µM NH₂OH and cytochrome c (each). Exetainers were incubated at 37 °C in the
anaerobic chamber. ³⁰N₂ and ³¹NO production was monitored by gas
chromatography (Agilent 6890 equipped with a Porapak Q column at 80 °C) combined with
a mass spectrometer (Agilent 5975c quadrupole inert MS).

Combined assay of kuste2859-2860-2861 and kustc1061. Cytochrome c
(50 µM final concentration, bovine heart, Sigma-Aldrich), Kustc1061 (4.7 µg),
1 mM ¹⁵NH₄⁺ and 5 µM N₂H₄ were added to 1.6 mg of kuste2859-2860-2861
in 1 ml phosphate buffer (20 mM, pH 7) in a 3-ml exetainer (Labco). The reaction
was started by adding phosphate buffer (20 mM, pH 7) with NO (0.9 mM) to a
final volume of 2 ml. Before incubation at 37 °C in the anaerobic chamber, 1 ml of
50% NO (in He) was added to the headspace. Control experiments were per- 
formed with ammonium (1 mM) with NH,OH (1mM) and HNO supplied as 
Angeli’s salt (41 mM) in separate incubations. *°N, production was monitored by 
gas chromatography (Agilent 6890 equipped with a Porapak Q column at 80 °C) 
combined with a mass spectrometer (Agilent 5975c quadruple inert MS). 
LC-MS/MS analysis and data processing. After PAGE, gels were stained with
colloidal Coomassie blue as described elsewhere. The gel lane was cut into four
slices and each slice was destained with three cycles of washing with 50 mM
ammonium bicarbonate and 50% acetonitrile. Protein reduction, alkylation and
digestion with trypsin were performed as previously described. After digestion,
samples were de-salted and purified according to Rappsilber et al. Sample
analysis by LC-MS/MS was performed using an Agilent nanoflow 1100 liquid
chromatograph coupled online through a nano-electrospray ion source (Thermo
Fisher Scientific) to a 7T linear ion trap Fourier transform ion cyclotron resonance
mass spectrometer (LTQ FT, Thermo Fisher Scientific). The chromatographic
column consisted of a 15-cm fused silica emitter (New Objective, PicoTip
Emitter, tip: 8 ± 1 µm, internal diameter 100 µm) packed with 3-µm C18 beads
(Reprosil-Pur C18 AQ, Dr Maisch GmbH). After loading the peptides onto the
column in buffer A (0.5% HAc), bound peptides were gradually eluted using a
67-min gradient of buffer B (80% ACN, 0.5% HAc). First, the concentration of
acetonitrile was increased from 2.4 to 8% in 5 min, followed by an increase from 8
to 24% acetonitrile in 55 min, and finally an increase from 24 to 40% acetonitrile in
7 min. The mass spectrometer was operated in positive ion mode and was
programmed to analyse the top four most abundant ions from each precursor scan
using dynamic exclusion. Survey mass spectra (350-2000 m/z) were recorded in
the ion cyclotron resonance cell at a resolution of R = 5E5. Data-dependent
collision-induced fragmentation of the precursor ions was performed in the linear
ion trap (normalized collision energy 27%, activation q = 0.250, activation time
30 ms).
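
The three-step gradient described above can be written as a piecewise-linear function of time. The sketch below is only an illustration: it assumes each ramp is linear, that the steps run back to back from 0 to 67 min, and that buffer A contributes no acetonitrile, so that the fraction of buffer B follows from its 80% acetonitrile content. The function names are hypothetical and not part of the original method.

```python
# Breakpoints of the gradient as (time in min, % acetonitrile),
# taken from the text: 2.4 -> 8% in 5 min, 8 -> 24% in 55 min,
# 24 -> 40% in 7 min (67 min in total).
GRADIENT = [(0.0, 2.4), (5.0, 8.0), (60.0, 24.0), (67.0, 40.0)]

# Buffer B contains 80% acetonitrile; buffer A is assumed to contain none.
ACN_IN_BUFFER_B = 80.0


def acetonitrile_percent(t_min: float) -> float:
    """Acetonitrile concentration (%) at time t_min, assuming linear ramps."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 67-min gradient")
    for (t0, c0), (t1, c1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            # Linear interpolation within the current segment.
            return c0 + (c1 - c0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


def buffer_b_percent(t_min: float) -> float:
    """Share of buffer B (%) needed to reach the acetonitrile level at t_min."""
    return 100.0 * acetonitrile_percent(t_min) / ACN_IN_BUFFER_B
```

Under these assumptions the stated acetonitrile levels of 2.4, 8, 24 and 40% correspond to 3, 10, 30 and 50% buffer B.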

Mass spectrometric data files were searched against the K. stuttgartiensis
database (known contaminants such as human keratins and trypsin were added to the
database) using the database search program Mascot (Matrix Science, version 2.2).
To obtain factors for the recalibration of precursor masses, initial searches were
performed with a precursor ion tolerance of 50 p.p.m. Fragment ions were
searched with 0.8-Da tolerance and searches allowed for one missed cleavage,
carbamidomethylation (C) as fixed modification, and deamidation (NQ) and
oxidation (M) as variable modifications. The results from these searches were
used to calculate the m/z-dependent deviation, which was used to recalibrate all
precursor m/z values. After recalibration of the precursor masses, definitive
Mascot searches were performed using the same settings as stated above, but with
a precursor ion tolerance of 20 p.p.m. Additionally, reverse database searches were
performed with the same settings. Protein identifications were validated and
clustered using the PROVALT algorithm to achieve a false-discovery rate of less
than 1% (ref. 38).
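
To make the search tolerances concrete, the sketch below shows how a precursor mass deviation in p.p.m. is computed and how a reverse-database search can give a rough false-discovery estimate. It is not the PROVALT algorithm and does not call Mascot: the decoy/target ratio used here is a deliberately simple stand-in, and all function names are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tolerance_ppm: float) -> bool:
    """True if the precursor lies inside a symmetric p.p.m. window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


def estimated_fdr(target_hits: int, decoy_hits: int) -> float:
    """Simple decoy-based estimate: hits in the reversed database
    divided by hits in the forward database (an assumption for
    illustration, not the PROVALT procedure)."""
    if target_hits <= 0:
        raise ValueError("need at least one target hit")
    return decoy_hits / target_hits
```

For example, a precursor observed at m/z 500.02 against a theoretical value of 500.00 deviates by about 40 p.p.m.; it would pass the initial 50 p.p.m. window but fail the 20 p.p.m. window used after recalibration.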
Two-dimensional gel electrophoresis. Before protein separation by isoelectric
focusing, 1 mg of the protein suspension was incubated with 1% (v/v) immobilized
pH-gradient (IPG) buffer of the appropriate range, 5 mM tributyl phosphine and
0.01% (w/v) bromophenol blue for 15 min at room temperature and centrifuged at
10,000g for 15 min at 10 °C. Isoelectric focusing was performed with the IPGphor
system using commercial 24-cm-long IPG strips with linear immobilized pH
gradients of various ranges. The conditions for rehydration of the IPG strips,
sample entry and isoelectric focusing were as follows: the temperature was set
constant at 18 °C and 50 µA per strip was applied.

Focused IPG strips were equilibrated before SDS-PAGE two times for 15 min in
375 mM Tris-HCl pH 8.5, 2% (w/v) SDS, 20% (w/v) glycerol, 6 M urea, 10 mM
DTT, 50 mM acrylamide and 0.1% (w/v) bromophenol blue. Gels were run for
45 min with constant cooling to 18 °C at 20 V, 40 W and subsequently at 40 V,
40 W until the bromophenol blue marker reached the end of the gel. Gels were
fixed in 30% (v/v) ethanol and 10% (v/v) glacial acetic acid and were stained with
colloidal Coomassie blue or silver stain. Picked gel spots were digested and
analysed with MALDI-TOF MS as described elsewhere.

Blue native PAGE. Blue native PAGE of the protein complexes was performed as
described elsewhere. For protein identification in two-dimensional gels, 16 cm
× 20 cm gels were cast in-house according to Calvaruso et al. with the following
exception: a 4-10% linear polyacrylamide gradient was used. Sample additive
(1.5 µl) (0.75 M 6-aminocaproic acid, 5% Serva Blue G) was added to 40 µg protein
sample before loading the gel.

Electrophoresis was performed at 50 V until the migration front entered the
resolving gel and then at 100 V until the migration front reached the end of the gel.
Cathode and anode buffer for blue native PAGE were 50 mM Bis-Tris, pH 7.0, and
50 mM Tricine, 15 mM Bis-Tris, pH 7.0, respectively. Preparation of the
first-dimension gel strip and assembly and casting of the second-dimension gel were
performed as described elsewhere with the exception that the second-dimension
cassette had the same thickness as the first dimension. No Coomassie blue was
added to the cathode buffer.
Transcriptomics. RNA was extracted using the Ribopure Bacteria Kit (Ambion)
according to the manufacturer’s instructions. First-strand cDNA was synthesized
with random primers using the RevertAid H Minus First Strand cDNA Synthesis
Kit, and the second strand was synthesized using DNA polymerase according to
the manufacturer’s instructions (Fermentas).

The quality scores of the obtained Solexa reads (3.5 million) were converted to 
PHRED format and mapped with Maq (http://maq.sourceforge.net) to the five 
contigs that constitute the K. stuttgartiensis genome (accession numbers 
CT030148, CT573071-4). From the aligned reads, the per-position coverage 
was calculated for each contig and used to calculate the coverage for each orf, 
intergenic region and predicted RNA element. 
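
The coverage calculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that aligned reads are available as (start, length) pairs with 0-based starts on one contig, and that the coverage of an orf or intergenic region is the mean per-position depth over that feature. Both function names are hypothetical.

```python
from typing import List, Tuple


def per_position_coverage(contig_length: int,
                          reads: List[Tuple[int, int]]) -> List[int]:
    """Read depth at every position of one contig.

    Each read is given as (start, length) with a 0-based start.
    """
    # Difference array: +1 where a read begins, -1 just past its end.
    diff = [0] * (contig_length + 1)
    for start, length in reads:
        begin = max(start, 0)
        end = min(start + length, contig_length)  # clip at the contig end
        if begin < end:
            diff[begin] += 1
            diff[end] -= 1
    # A running sum of the difference array gives the depth per position.
    coverage = []
    depth = 0
    for delta in diff[:contig_length]:
        depth += delta
        coverage.append(depth)
    return coverage


def feature_coverage(coverage: List[int], start: int, end: int) -> float:
    """Mean depth over the half-open interval [start, end) of a feature."""
    if not 0 <= start < end <= len(coverage):
        raise ValueError("feature must lie within the contig")
    return sum(coverage[start:end]) / (end - start)
```

With this layout, the same per-position list serves every annotated feature on a contig, so each orf, intergenic region or RNA element only needs its coordinates to obtain a coverage value.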
Metabolic priming by a secreted fungal effector
Maize smut caused by the fungus Ustilago maydis is a widespread
disease characterized by the development of large plant tumours.
U. maydis is a biotrophic pathogen that requires living plant tissue
for its development and establishes an intimate interaction zone
between fungal hyphae and the plant plasma membrane. U. maydis
actively suppresses plant defence responses by secreted protein
effectors. Its effector repertoire comprises at least 386 genes
mostly encoding proteins of unknown function and expressed
exclusively during the biotrophic stage. The U. maydis secretome
also contains about 150 proteins with probable roles in fungal
nutrition, fungal cell wall modification and host penetration as well as
proteins unlikely to act in the fungal-host interface, such as a
chorismate mutase. Chorismate mutases are key enzymes of the shikimate
pathway and catalyse the conversion of chorismate to prephenate,
the precursor for tyrosine and phenylalanine synthesis. Root-knot
nematodes inject a secreted chorismate mutase into plant cells, probably
to affect development. Here we show that the chorismate mutase
Cmu1 secreted by U. maydis is a virulence factor. The enzyme is
taken up by plant cells, can spread to neighbouring cells and changes
the metabolic status of these cells through metabolic priming.
Secreted chorismate mutases are found in many plant-associated
microbes and might serve as general tools for host manipulation.
The U. maydis genome (http://mips.helmholtz-muenchen.de/genre/
proj/ustilago) contains genes for both a cytosolic chorismate mutase,
designated aro7 (um04220), and a putatively secreted chorismate
mutase, cmu1 (um05731). Cmu1 belongs to the AroQ class of
eukaryotic chorismate mutases (Interpro: IPR008238) that have an
all-α-helical secondary structure (Supplementary Fig. 1). To verify
that Cmu1 is a dedicated chorismate mutase, we demonstrated that it
complemented a Saccharomyces cerevisiae aro7 mutant (Fig. 1a) and
that heterologously expressed protein had chorismate mutase activity
which was not feedback inhibited by aromatic amino acids
(Supplementary Fig. 2). Allosteric regulation is a characteristic feature
of plastidic chorismate mutases as well as of cytoplasmic fungal
chorismate mutases, whereas cytosolic plant chorismate mutases
lack this feature. Attempts to generate a cmu1 mutant that displayed
allosteric regulation based on features of S. cerevisiae Aro7p were
unsuccessful (Supplementary Fig. 3). Western blot analysis detected
Cmu1-haemagglutinin (HA) in U. maydis culture supernatants when
the respective fusion gene was expressed under a constitutive promoter
in hyphal cells (Supplementary Fig. 4). The secretion of Cmu1 during
plant colonization was independently demonstrated by proteome
analysis of apoplastic fluids isolated after infection of maize with a
mixture of compatible U. maydis strains. Compared with known
secreted effector proteins like Mig2 (ref. 12), higher numbers of
Cmu1 peptides were identified at all time points analysed
(Supplementary Information, Table 1).


Like many other U. maydis effectors with a virulence function,
cmu1 is specifically upregulated during biotrophic development
(Supplementary Fig. 5) and is one of the most highly expressed fungal
genes during plant colonization. To determine a possible
contribution to virulence, cmu1 was deleted in the solopathogenic strains
SG200 (ref. 3) and CL13. CL13 is the progenitor strain of SG200 that
shows attenuated virulence (see Supplementary Fig. 6a for disease
symptoms of CL13) and hence facilitates the detection of modest
differences in virulence. Whereas SG200Δcmu1 strains showed little
virulence attenuation (Supplementary Fig. 6b), the CL13Δcmu1
mutant displayed a reduction of about 50% in tumours, which could
be complemented by introducing a single copy of cmu1-HA (Fig. 1b).
This illustrates that Cmu1 is required for full virulence and
demonstrates functionality of the HA-tagged protein.

Figure 1 | Cmu1 has chorismate mutase activity, affects virulence and
salicylic acid levels. a, U. maydis cmu1 complements the aro7 deletion in S.
cerevisiae. Growth of the S. cerevisiae aro7 deletion mutant Y05479 on medium
lacking tyrosine and phenylalanine (SD-phe-tyr) is restored by introduction of
U. maydis cmu1, whereas cmu1(R183A,K194A) does not complement. Expression of
cmu1 genes was driven by the GAL1 promoter. S. cerevisiae Y00000 (native
ARO7 gene) was used as positive control. YEPD, rich medium; gal, galactose.
b, Deletion of cmu1 negatively affects virulence of U. maydis strain CL13.
Disease symptoms (as depicted in Supplementary Fig. 6a) on maize plants were
scored 12 days after infection with the indicated strains. Mean values of seven
independent infections are shown with the total number of infected plants
indicated above each column. Compared with CL13 and CL13Δcmu1-cmu1-HA,
CL13Δcmu1 showed significantly reduced tumour formation (t-test,
P = 0.037). c, Total amounts of salicylic acid were determined in plant leaves
infected with the U. maydis strains indicated below, 8 days after infection.
For the infections with CL13Δcmu1, three independent strains were used.
Mean values of three independent experiments are shown. Error bars, s.d.

To localize Cmu1 during biotrophic growth, plants were infected
with SG200Δcmu1-cmu1-HA, which carries a cmu1-HA fusion gene
inserted in single copy under control of its native promoter. Plants
infected with SG200 or with SG200 P(cmu1)GFP-HA expressing
cytoplasmic green fluorescent protein (GFP) under the cmu1 promoter
served as negative controls. Freeze-substituted and resin-embedded
sections of maize tissue harvested 3 days after infection with these
strains were incubated with anti-HA antibodies and gold markers.
Cmu1-HA could be detected inside the fungal hyphae, in the
biotrophic interface as well as inside the plant cytoplasm but rarely in
the plant cell wall (Fig. 2A and Supplementary Fig. 7). The distribution
of gold particles was quantified (Fig. 2B). Gold labelling of plant tissue
infected with the parental strain SG200 was negligible
(Supplementary Fig. 8), whereas non-secreted GFP-HA was absent from the
biotrophic interface, showed strong accumulation in the fungal cytosol
and weak background labelling in the plant cytosol
(Supplementary Fig. 9 and Fig. 2B). Integrity of Cmu1-HA was demonstrated
by western blot analysis after immunoprecipitation from infected
plant tissue (Supplementary Fig. 10). To demonstrate Cmu1
localization independently, plants were infected with
SG200Δcmu1-cmu1-mCherry-HA. Cmu1-mCherry-HA was detected in the
biotrophic interface, and plasmolysis experiments showed that it freely
diffused in the enlarged apoplast (Supplementary Fig. 11). However,
fluorescence could not be detected inside plant cells. In addition,
Cmu1-mCherry-HA was unable to complement the virulence
phenotype of CL13Δcmu1 (Supplementary Fig. 12a) despite the fact that
the fusion protein was enzymatically active as demonstrated by
complementation of the aro7 yeast mutant (Supplementary Fig. 13).
Cmu1(22-290)-HA lacking the secretion signal was unable to
complement the virulence phenotype of CL13Δcmu1
(Supplementary Fig. 12b), demonstrating that secretion is a prerequisite for
function. In sum, these data suggest that Cmu1 needs to enter plant cells to
exert its function and that Cmu1-mCherry-HA is unable to do so.

To elucidate the subcellular localization of Cmu1 in plant cells, a
Cmu1(22-290)-mCherry fusion protein lacking the signal peptide and
active in complementing the yeast aro7 mutant was transiently
expressed in maize leaves (Supplementary Fig. 13).
Cmu1(22-290)-mCherry localized to the cytoplasm and the nucleus of transformed
maize cells (Fig. 2C). Surprisingly, in some cases the
Cmu1(22-290)-mCherry signal was also visible in cells adjacent to the originally
transformed cell (Fig. 2C, a). To rule out that the latter is caused by
independent transformation events, Cmu1(22-290)-yellow fluorescent
protein (YFP) and PIP496-593-mCherry, encoding a fusion protein that
localizes exclusively to the nucleus, were co-expressed
(Supplementary Fig. 14). Cell-to-cell spreading of Cmu1(22-290)-YFP was observed
in some cases whereas PIP496-593-mCherry always remained in the
nucleus of the originally transformed cell (Supplementary Fig. 14).
Occasionally guard cells were transformed, and in such cases
spreading of Cmu1(22-290)-YFP was never observed (Fig. 2C, c and
Supplementary Fig. 14). Because guard cells lack plasmodesmata,
the observed spreading of Cmu1(22-290) is likely to occur through
plasmodesmata.

By yeast two-hybrid analysis we demonstrated that Cmu1 can
dimerize (Supplementary Fig. 15a), a property characteristic for
AroQ chorismate mutases. In addition, Cmu1 could interact with
the two maize chorismate mutases ZmCm1 (B6TU00) and ZmCm2
(B4FUP5) (Supplementary Fig. 15a). Despite low overall sequence
conservation, known residues essential for chorismate mutase activity
were conserved in all these enzymes (Supplementary Fig. 15b).
Figure 2 | Cmu1 is translocated to plant cells and spreads to neighbouring
tissue. A, A maize section infected with U. maydis SG200Δcmu1-cmu1-HA
was collected 3 days after infection, probed with mouse anti-HA antibodies and
detected with anti-mouse antibodies conjugated to 12-nm gold particles
(Methods). Electron micrographs visualize Cmu1-HA inside fungal hyphae, in
the biotrophic interface and in the cytoplasm of infected maize cells. Leaves
infected with SG200 and SG200 P(cmu1)GFP-HA served as negative controls
(Supplementary Figs 8 and 9). fc, fungal cytosol; fcw, fungal cell wall; bi,
biotrophic interface; pc, plant cytosol; pcw, plant cell wall; ppm, plant plasma
membrane. Scale bars, 1 µm. B, Electron micrographs of immunogold-labelled
sections were analysed for the spatial distribution of gold labels in
SG200Δcmu1-cmu1-HA (blue) and SG200 P(cmu1)GFP-HA (red) infected
tissue 3 days after infection. The total number of gold labels in each electron
micrograph was set to 100% (see Methods for details). Error bars, s.d. of gold
particles counted in three independent cross-sections. C, Confocal Z-stacks
visualize spreading of Cmu1(22-290)-mCherry to neighbouring tissue after
biolistic transformation of maize leaves. White arrows indicate the originally
transformed maize cells that carry a gold particle in their nucleus (a-c).
Spreading of the fluorescent signal was observed in some cases for
Cmu1-mCherry (a) and not in others (b, c). Yellow arrows mark
Cmu1(22-290)-mCherry signals in nuclei of neighbouring cells (a). Cmu1(22-290)-mCherry
spreading was never detected in transformed guard cells (c). Scale bars, 40 µm.
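
The normalization described for panel B, in which the gold labels of each micrograph are set to 100% and then summarized across three cross-sections, can be sketched as follows. The compartment keys follow the abbreviations in the legend; the counts in the usage note are invented for illustration only, and the use of the sample standard deviation is an assumption because the legend does not say which form of s.d. was used.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def normalize_counts(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert gold-label counts per compartment to percentages of the
    total number of labels in one micrograph (total = 100%)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("micrograph contains no gold labels")
    return {name: 100.0 * n / total for name, n in counts.items()}


def summarize(sections: List[Dict[str, int]]) -> Dict[str, Tuple[float, float]]:
    """Mean and sample s.d. of the percentage per compartment across
    independent cross-sections (at least two are required)."""
    percents = [normalize_counts(section) for section in sections]
    compartments = percents[0].keys()
    return {
        name: (mean(p[name] for p in percents),
               stdev(p[name] for p in percents))
        for name in compartments
    }
```

As a usage example with made-up counts, three sections with 50, 60 and 70 labels in the fungal cytosol (fc) out of 100 labels each give a mean of 60% and an s.d. of 10% for that compartment.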
ZmCm1 encodes a predicted plastidic isoform (Supplementary Fig. 16a)
whereas ZmCm2 codes for a putative cytoplasmic enzyme
(Supplementary Fig. 15b). Localization to the respective compartments was
demonstrated by transient expression in maize leaves (Supplementary
Fig. 16b, c). The observed compartmentalization mimics what has been
described for the chorismate mutases of Arabidopsis thaliana.
Furthermore, as shown for the cytoplasmic isoform of chorismate
mutase in A. thaliana, ZmCm2 displayed enzymatic activity but no
allosteric regulation in vitro (Supplementary Fig. 15c).

To demonstrate that the interaction between Cmu1 and the cytosolic
maize chorismate mutase ZmCm2 can have functional consequences,
we attempted to show that Cmu1 could alter ZmCm2 activity
in vitro. As this was unsuccessful (Supplementary Fig. 15d), we next
generated a loss-of-function allele of cmu1 based on catalytically
inactive forms of S. cerevisiae Aro7p (Supplementary Fig. 1a).
We reasoned that heterodimer formation between active and inactive
monomers might interfere with chorismate mutase function. As
expected, the cmu1(R183A,K194A) allele was unable to complement
the aro7 deletion in S. cerevisiae (Fig. 1a). Surprisingly, when
cmu1(R183A,K194A) was introduced in single copy in either CL13Δcmu1
or SG200Δcmu1, virulence was completely abolished (Supplementary
Fig. 17). When cmu1(R183A,K194A) was introduced in SG200 harbouring
a functional cmu1 allele, the mutated allele had a dominant effect that
was copy number dependent (Supplementary Fig. 17). Confocal
microscopy of infected leaf tissue revealed that the
SG200Δcmu1-cmu1(R183A,K194A) strain could form appressoria (Supplementary Fig. 18a)
but failed to colonize the plant (Supplementary Fig. 18b). In
contrast to SG200 infections, plant cells infected with
SG200Δcmu1-cmu1(R183A,K194A) were heavily stained by propidium iodide and
displayed strong autofluorescence, probably because of the formation
of phenolic compounds (Supplementary Fig. 18). This indicates that
SG200Δcmu1-cmu1(R183A,K194A) elicits a strong plant defence response.

To exclude the possibility that the non-functional secreted chorismate
mutase might interfere with the endogenous fungal shikimate pathway,
we generated SG200Δcmu1 derivatives that express cmu1R183A,K194A
under control of a strong constitutive promoter (SG200Δcmu1-
Potefcmu1R183A,K194A-HA). Western blot analysis confirmed that the
mutant protein was produced and secreted (Supplementary Fig. 19a).
These strains did not show a growth phenotype on minimal media
lacking aromatic amino acids, were morphologically indistinguishable
from SG200 during growth in minimal media, and were unaltered in
filamentous growth on charcoal media, a prerequisite for successful
infection (Supplementary Fig. 19b-e). This illustrates that the secreted
Cmu1R183A,K194A-HA protein does not interfere with the activity of the
cytoplasmic Aro7 protein in U. maydis. Cmu1R183A,K194A-mCherry-
HA accumulated around biotrophic hyphae like other secreted effectors
(Supplementary Fig. 20b) but was unable to cause a dominant
negative virulence phenotype when expressed in SG200Δcmu1
(Supplementary Fig. 20a). This suggests that plant uptake of
Cmu1R183A,K194A-mCherry-HA is necessary for the dominant effect
on virulence, presumably because activity of ZmCm2 is affected. To
obtain evidence that Cmu1R183A,K194A-HA can reduce ZmCm2
activity, we first showed that ZmCm2 was able to interact with
Cmu1R183A,K194A and could complement the aro7 mutation in
S. cerevisiae (Supplementary Fig. 21). Next, zmcm2 was co-expressed
with cmu1 or cmu1R183A,K194A in the yeast aro7 mutant strain.
Although the co-expression of zmcm2 and cmu1 had no detectable
effect on growth, co-expression of zmcm2 and cmu1R183A,K194A
attenuated growth on plates lacking phenylalanine and tyrosine
(Supplementary Fig. 21b). Therefore, the dominant negative effect on
virulence elicited by the cmu1R183A,K194A-HA allele is probably caused by
interfering with the activity of cytosolic ZmCm2 through dimerization.
This also implies that orphan cytosolic plant chorismate mutases
might have an important regulatory function.

To obtain a comprehensive view on the metabolic changes in plants
infected with CL13, CL13Δcmu1 and CL13Δcmu1-cmu1-HA,
metabolome analyses were conducted 8 days after infection (Sup- 
plementary Figs 22 and 23 and Supplementary Table 2). Compared 
with mock-infected maize, plants infected with CL13 showed 
enhanced levels for phenylpropanoid and lignan biosynthesis products 
as well as for benzoxazinones, which derive from tryptophan (Sup- 
plementary Fig. 22 and Supplementary Table 3). For plants infected 
with CL13 and CL13Δcmu1, the most notable differences concerned
the phenylpropanoid pathway (Supplementary Fig. 22). Substances
such as coumaroyl- and caffeoylquinate and syringine as well as lignans
(such as the syringaresinol-glucosides) were less abundant in tissue
infected with CL13Δcmu1 than in plants infected with either CL13
or the complemented strain CL13Δcmu1-cmu1-HA (Supplementary
Fig. 22b and Supplementary Tables 2 and 3). In contrast, the amount of
salicylic acid was at least ten times higher in plants infected with
CL13Δcmu1 than in those infected with the parental strain CL13 or
CL13Δcmu1-cmu1-HA (Fig. 1c). The amounts of the
tryptophan-derived benzoxazinones were not significantly different
in CL13Δcmu1 and CL13 infections (Supplementary Fig. 22), indicating
that the pathway from chorismate to tryptophan through
anthranilate synthase is unaffected by Cmu1 activity. The underlying
mechanism for this differential effect awaits further study. Our results
support a situation in which Cmu1 channels chorismate into the
phenylpropanoid pathway and prevents its flow into the salicylic acid
biosynthesis branch.

To elucidate the biological significance of the elevated salicylic acid
levels in CL13Δcmu1 infections, maize seedlings were treated locally
with 4 mM salicylic acid before infection or co-infiltrated during the
infection with CL13. This concentration was chosen on the basis of
total salicylic acid levels determined in CL13Δcmu1-infected plants.
Both treatments led to a reduction in virulence comparable to
CL13Δcmu1 infections (Supplementary Fig. 24), which illustrates that
salicylic acid enhances resistance of maize towards U. maydis. The data
imply that the observed decrease in virulence for CL13Δcmu1 could be
a direct consequence of its inability to interfere with pathogen-induced
salicylic acid biosynthesis of the host plant.

Our findings provide new insights into a process that aids U. maydis 
during colonization of maize plants. It relies on the secretion of a 
chorismate mutase which enters plant cells by an unknown mech- 
anism and redirects the metabolome in favour of the parasite. 

We propose that the translocated fungal enzyme acts in conjunction
with ZmCm2 in the plant cytosol by increasing the flow of chorismate
from the plastid to the cytosol and in turn lowering the available
substrate for salicylic acid biosynthesis in plastids (Fig. 3). However,
the introduction of a deregulated chorismate mutase into the host
plant cytosol alone cannot explain all the metabolic changes observed
in the U. maydis infected tissue (Supplementary Table 2 and
Supplementary Fig. 18). Thus, in line with its modest effects on
virulence, we consider Cmu1 to be one component of a cocktail of effectors
shaping the host metabolome. In this context it might not be a coincidence
that U. maydis has genes for two potential salicylate hydroxylases.
In addition, organ-specific functions as described for several other
U. maydis effectors cannot be excluded.

Figure 3 | Model of Cmu1-mediated metabolic priming in infected maize
tissue. An infecting fungal hypha is depicted in yellow. Maize cells are shown in
mint green; the plastid is depicted in darker green. The dotted line indicates that
prephenate or prephenate derivatives might be re-imported into the plastid or
have regulatory capacity to feed back on plastidic synthesis of phenylalanine and
tyrosine or derived phenolic compounds. SA, salicylic acid. For details, see text.

The suppression of salicylic acid levels is likely to be particularly
important for biotrophic pathogens and symbionts. In line with this,
we found genes encoding secreted chorismate mutases in many genomes 
of eukaryotic biotrophic plant pathogens and symbionts and several 
hemibiotrophic plant pathogens but only rarely in necrotrophic plant 
pathogens and fungal saprophytes (Supplementary Table 4). Recent 
findings indicate that the secreted chorismate mutase in the fungus 
Sclerotinia sclerotiorum might also represent a virulence factor (M. 
Dickman, personal communication). Metabolic priming by secreted 
chorismate mutases might thus emerge as a common strategy for host 
manipulation. 


METHODS SUMMARY 


The Methods section provides detailed information about all experimental proce- 
dures, including the following: (1) tables with details on oligonucleotides, plasmids, 
U. maydis and S. cerevisiae strains used or generated in this study; (2) details on the 
cloning strategies; (3) description of U. maydis mutant generation and their sub- 
sequent analysis; (4) links to bioinformatic tools applied in this study; (5) details for 
conducting quantitative real-time PCR analyses; (6) description of the yeast com- 
plementation assays; (7) description of yeast protein interaction assays; (8) details 
for conducting chorismate mutase activity assays; (9) the method to demonstrate 
protein secretion in U. maydis; (10) details on the transient expression in Zea mays; 
(11) confocal and electron microscopy methods; (12) details for metabolome and 
hormone analyses; and (13) protocol for the isolation and mass spectrometric 
analysis of secreted proteins of apoplastic fluids. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Generation of plasmids. Standard molecular cloning strategies and techniques
were applied in this study. Most of the constructs were generated using Gateway
technology (Invitrogen) after an insertion of the gene of interest into the pEntry4
vector (Invitrogen) using NcoI and NotI restriction sites. Primers used in this
study are described in Supplementary Table 5. Plasmids that were generated in 
this study are listed in Supplementary Table 6. 

Mutant generation and analysis. All U. maydis strains (Supplementary Table 7)
were generated by gene replacement with PCR-generated constructs or by insertion
of p123 derivatives into the ip locus as described.
At least three independent mutants were repeatedly tested for virulence on 7-day-
old maize seedlings and disease was scored 12 days after infection following
described protocols. The widely used solopathogenic haploid strains SG200
and CL13 differ in virulence owing to the presence of autocrine pheromone
signalling in SG200 and its absence in CL13. Compared with the
naturally occurring dikaryon, both strains show reduced virulence. Typical symptoms
caused by CL13 are depicted in Supplementary Fig. 3a.

Bioinformatic analyses. Signal peptide prediction was performed with the pro- 
gram SignalP 3.0 (http://www.cbs.dtu.dk/services/SignalP/). Chloroplast transit 
peptides were predicted with the program ChloroP (http://www.cbs.dtu.dk/ 
services/ChloroP/). Sequence alignments were generated using CloneManager 
Suite 9.0 (www.scied.com). Hierarchical neural network was applied for prediction 
of the Cmu1 secondary structure at the Network Protein Sequence Analysis server 
(http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_nn.html). Domain 
analyses were performed with Smart and InterPro (http://smart.embl-heidelberg. 
de/; http://www.ebi.ac.uk/Tools/InterProScan/). 

Quantitative real-time PCR. RNA was extracted from sporidia grown in axenic
culture as well as from infected maize plants at the indicated time points with the
TRIzol method (Invitrogen), treated with DNase (Ambion) and subsequently used
for cDNA synthesis. Quantitative real-time PCR reactions were conducted as
described earlier. All reactions were performed in at least three biological replicates.
Relative cmu1 expression levels were calculated in relation to the values obtained
for the constitutively expressed peptidyl-prolyl cis-trans isomerase gene (ppi) of
U. maydis (Supplementary Table 5).
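The normalisation to ppi described above can be sketched in a few lines. The text does not state the formula used, so the 2^-ΔCt form (which assumes roughly 100% amplification efficiency), the function name and the Ct values below are illustrative assumptions, not the authors' procedure or data.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference); assumes ~100% PCR efficiency."""
    return 2.0 ** -(ct_target - ct_reference)


# Hypothetical Ct values for three biological replicates (not data from the study).
cmu1_ct = [22.0, 22.5, 21.5]
ppi_ct = [20.0, 20.5, 19.5]

# One ratio per replicate, then mean and s.d. across replicates.
ratios = [relative_expression(t, r) for t, r in zip(cmu1_ct, ppi_ct)]
summary = (mean(ratios), stdev(ratios))
```

Each replicate is normalised to its own reference measurement before averaging, so sample-to-sample differences in cDNA input cancel out.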

Yeast complementation assay. Yeast strain Y054679 lacking the ARO7 gene was 
transformed with the corresponding pYES and pGad derivatives (Supplementary 
Table 6) using standard protocols (Clontech) and tested for growth on medium 
lacking phenylalanine and tryptophan as described previously. A compilation of
all S. cerevisiae strains used in this study is provided in Supplementary Table 8.

Yeast protein interaction assay. The genes encoding the proteins tested for
interaction were cloned into pGBKT7 or pGADT7 vectors (Clontech; Supplemen- 
tary Table 6), generating in-frame fusions with a gene encoding the yeast GAL4 
binding and activation domain, respectively. Interaction was tested in S. cerevisiae 
AH109 (Clontech). Growth controls were performed on selective dropout media 
(SD) plates lacking only tryptophan and leucine to select for cells containing the 
correct plasmids. Protein interactions were assayed on high-stringency SD plates 
additionally lacking adenine and histidine. 

Chorismate mutase activity assays. A glutathione S-transferase (GST)-Cmu1 2_299-
HA fusion protein was produced in Escherichia coli BL21 containing plasmid
pRset-cmu2-299-HA (Supplementary Table 6) and enriched by glutathione-
affinity purification (GE Healthcare). GST-ZmCm2 fusion protein and
derivatives of Cmu1 for the enzyme assay were produced in the same way. After removal
of the GST moiety using PreScission protease (GE Healthcare), chorismate
mutase activity assays were performed. After acidic conversion of prephenate
to phenylpyruvate, the reaction was basified and the extinction at λ = 320 nm was
measured. The increase in extinction was plotted against time (in minutes) to
visualize the formation of phenylpyruvate. Error bars represent s.d. from three
technical replicates. Purified GST protein was used as a negative control and
the respective values were subtracted from those obtained with Cmu1 2_299-HA.

Demonstration of Cmu1 secretion. U. maydis strain AB33 Potefcmu1-HA was
generated by insertion of plasmid p123_otef:um05731-HA into the ip locus of
AB33 (Supplementary Table 7). To analyse Cmu1-HA secretion, material was
collected 6 h after induction of filamentous growth in medium containing
nitrate. Protein extracts of filamentous cells and culture supernatants (after
precipitation with trichloroacetic acid) were subjected to western blot analysis
with mouse anti-HA (Sigma) and mouse anti-α-tubulin antibodies (Oncogene).

Biolistic transformation of Z. mays. For biolistic transformation of 7- to
10-day-old maize leaves, 1.6-µm gold particles were coated with plasmid DNA
coding for the indicated genes driven by the CaMV 35S promoter (Supplementary
Table 6). Bombardment was performed using a PDS-1000/He instrument
(BioRad) at 900 p.s.i. in a 27 in. Hg vacuum. Fluorescence was observed by confocal
microscopy 2 days after transformation.

Confocal and electron microscopy. Confocal microscopy was performed with a
Leica SP5 confocal microscope as described. Wheat germ agglutinin/Alexa Fluor
488 and propidium iodide stains were performed as reported. Autofluorescence
was detected at λ = 415-460 nm.

For immunogold labelling, infected leaf parts were cryofixed by high-pressure
freezing (Bal-Tec HPM 010), freeze-substituted in 0.5% glutaraldehyde in acetone
(containing 2% H₂O), infiltrated with Lowicryl HM20 and ultraviolet-polymerized
at −40 °C. Ultrathin sections were labelled for HA epitope detection using mouse
anti-HA (Sigma H9658) and donkey anti-mouse 12-nm gold antibodies (Jackson
715-205-150) and imaged in a Philips CM10 electron microscope at 60 kV.

The distribution of gold particles was determined semi-quantitatively as
described. Micrographs of sectioned Z. mays samples from infections with
SG200 Pcmu1Cmu1-HA and SG200 Pcmu1GFP-HA were selected and in each case
three different hyphae were chosen. Gold particles on each of the micrographs
were then counted and assigned to the plant cell cytosol, the biotrophic interface or
the fungal cytosol. The proportional distribution in these compartments was then
calculated as a percentage and the s.d. was calculated from the three different data
sets for each sample.
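The counting scheme described above can be illustrated with a minimal sketch: per-hypha counts are converted to percentages, and the mean and s.d. are then taken across the three hyphae. The function names and the particle counts are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, stdev

COMPARTMENTS = ("plant cytosol", "biotrophic interface", "fungal cytosol")


def percent_distribution(counts):
    """Convert gold-particle counts for one hypha into percentages per compartment."""
    total = sum(counts[c] for c in COMPARTMENTS)
    if total == 0:
        raise ValueError("no gold particles counted")
    return {c: 100.0 * counts[c] / total for c in COMPARTMENTS}


def summarise(hyphae):
    """Mean and sample s.d. of the percentage in each compartment across hyphae."""
    per_hypha = [percent_distribution(h) for h in hyphae]
    return {
        c: (mean([p[c] for p in per_hypha]), stdev([p[c] for p in per_hypha]))
        for c in COMPARTMENTS
    }


# Hypothetical counts for three hyphae of one sample (not data from the study).
example = [
    {"plant cytosol": 10, "biotrophic interface": 30, "fungal cytosol": 60},
    {"plant cytosol": 20, "biotrophic interface": 30, "fungal cytosol": 50},
    {"plant cytosol": 30, "biotrophic interface": 30, "fungal cytosol": 40},
]
result = summarise(example)
```

Converting to percentages before averaging keeps a hypha with many particles from dominating the summary.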

Metabolome analyses. For metabolite fingerprinting, a section of the third leaf
between 1 and 3 cm below the injection holes was excised 8 days after syringe
infection with U. maydis strains or water (mock control). For each
replicate, 30-40 leaf sections were pooled. Plant material was homogenized under
liquid nitrogen. Two or three biological replicates of control leaves and infected
leaves (80 mg each) were extracted with methyl-tert-butylether/methanol. The
polar phase was dried under a nitrogen stream and the extracted metabolites were
redissolved in 10 µl of methanol, 10 µl of acetonitrile and 120 µl of water. The metabolite
analysis was performed by ultra-performance liquid chromatography (UPLC,
ACQUITY UPLC System, Waters Corporation) coupled with an orthogonal
time-of-flight mass spectrometer (TOF-MS, LCT Premier, Waters Corporation).
For LC, an ACQUITY UPLC BEH SHIELD RP18 column (1 mm × 100 mm,
1.7 µm particle size, Waters Corporation) was used at a temperature of 40 °C, a
flow rate of 0.2 ml min⁻¹ and with the following gradient for the analysis of the
polar phase: 0-0.5 min 10% B, 0.5-3 min from 10% B to 28% B, 3-8 min from 28%
B to 95.5% B, 8-10 min 95.5% B and 10-14 min 10% B (solvent system A: water/
formic acid (100:0.1, v/v); B: acetonitrile/formic acid (100:0.1, v/v)). The TOF-MS
was operated in negative as well as positive electrospray ionization mode in W
optics with a mass resolution larger than 10,000. Data were acquired by MassLynx
software (Waters Corporation) in centroided format over a mass range of m/z
85-1,200 with a scan duration of 0.5 s and an interscan delay of 0.1 s. The capillary and
the cone voltage were maintained at 2,700 V and 30 V and the desolvation and
source temperature at 350 °C and 80 °C, respectively. Nitrogen was used as cone
(30 l h⁻¹) and desolvation gas (800 l h⁻¹). For accurate mass measurement, the
TOF-MS was calibrated with phosphoric acid 0.01% (v/v) in acetonitrile/water
(50:50, v/v) and the dynamic range enhancement mode was used for data recording.
All analyses were monitored by using leucine-enkephalin ([M+H]⁺
556.2771 or [M−H]⁻ 554.2615 as well as its ¹³C isotopomer [M+H]⁺
557.2803 or [M−H]⁻ 555.2615, Sigma-Aldrich) as lock spray reference compound
at a concentration of 0.5 µg ml⁻¹ in acetonitrile/water (50:50, v/v) and a
flow rate of 30 µl min⁻¹. The raw mass spectrometry data of all samples were
processed using the MarkerLynx Application Manager for MassLynx software
(Waters Corporation), resulting in two data sets.
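The solvent gradient given above is a piecewise-linear programme, which can be written down as a small lookup. This is only a sketch: the segment table transcribes the times and percentages from the text, while the function name and the assumption that the return to 10% B at 10 min is an instantaneous step are illustrative choices.

```python
# (start_min, end_min, %B at start, %B at end), transcribed from the gradient in the text.
GRADIENT = [
    (0.0, 0.5, 10.0, 10.0),
    (0.5, 3.0, 10.0, 28.0),
    (3.0, 8.0, 28.0, 95.5),
    (8.0, 10.0, 95.5, 95.5),
    (10.0, 14.0, 10.0, 10.0),  # assumed step back to 10% B for re-equilibration
]


def percent_b(t_min):
    """Return the programmed percentage of solvent B at time t_min (minutes)."""
    run_end = GRADIENT[-1][1]
    for start, end, b0, b1 in GRADIENT:
        # Segments are half-open [start, end); the final time point is included.
        if start <= t_min < end or (t_min == end == run_end):
            return b0 + (b1 - b0) * (t_min - start) / (end - start)
    raise ValueError("time outside the 0-14 min programme")
```

For example, `percent_b(1.75)` falls halfway through the 10% to 28% ramp, and `100 - percent_b(t)` gives the matching percentage of solvent A.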

The toolbox MarVis (http://marvis.gobics.de; ref. 33) was used for ranking, filtering,
adduct correcting and combining the data as well as for clustering and visualiza- 
tion, respectively. An analysis of variance test was applied to extract a subset of 
high-quality marker candidates with a p value less than 1 X 10°. The filtered data 
sets were adduct corrected according to the following rules: [M+H]⁺,
[M+Na]⁺, [M+NH₄]⁺ for the positive and [M−H]⁻, [M+CH₂O₂−H]⁻,
[M+CH₂O₂+Na−2H]⁻ for the negative ionization mode. The combined data
led to an overall data set of 810 marker candidates (Supplementary Table 2), which 
were used for clustering and visualization by means of one-dimensional self- 
organizing-maps and for database search. 
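The adduct rules listed above amount to subtracting a fixed mass shift from each observed m/z to recover the neutral monoisotopic mass, so that positive- and negative-mode features of the same compound can be combined. The following sketch shows that arithmetic for singly charged ions; it is not the MarVis implementation, and the names and standard elemental masses are supplied here as assumptions.

```python
# Monoisotopic masses in Da (standard values, not taken from the source text).
H, C, N, O, NA = 1.00782503, 12.0, 14.0030740, 15.9949146, 22.9897693
ELECTRON = 0.00054858
FORMIC_ACID = C + 2 * H + 2 * O  # CH2O2

# Shift = observed m/z minus neutral mass M, for singly charged ions.
ADDUCT_SHIFT = {
    "[M+H]+": H - ELECTRON,
    "[M+Na]+": NA - ELECTRON,
    "[M+NH4]+": N + 4 * H - ELECTRON,
    "[M-H]-": -H + ELECTRON,
    "[M+CH2O2-H]-": FORMIC_ACID - H + ELECTRON,
    "[M+CH2O2+Na-2H]-": FORMIC_ACID + NA - 2 * H + ELECTRON,
}


def neutral_mass(mz, adduct):
    """Return the neutral monoisotopic mass implied by an m/z and an adduct rule."""
    return mz - ADDUCT_SHIFT[adduct]


# Lock-mass values for leucine-enkephalin quoted in the Methods text.
pos = neutral_mass(556.2771, "[M+H]+")
neg = neutral_mass(554.2615, "[M-H]-")
```

Both lock-mass values map to nearly the same neutral mass, which is the property that allows markers from the two ionization modes to be merged.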

The identity of selected markers was confirmed by MS² fragment information,
co-elution with identical standards or exact mass measurement (Supplementary
Table 3).

Salicylic acid measurements. For metabolite fingerprinting, a section of the third
leaf between 1 and 3 cm below the injection holes was excised 8 days after syringe 
infection with U. maydis strains or water (mock control), respectively. For each 
replicate, 30-40 leaf sections were pooled. 

Total salicylic acid was extracted and identified by co-elution with an authentic
standard using liquid chromatography—mass spectrometry. 

Proteome analysis of apoplastic fluids. To extract apoplastic fluids, maize
seedlings were infected with a mixture of FB1 and FB2 (ref. 36). Two, four and
six days after infection, infected areas were excised and apoplastic fluid was
collected. After precipitation with trichloroacetic acid, proteins were separated
by 12% SDS-polyacrylamide gel electrophoresis, digested in gel after subdividing
each lane into 11 equal parts and run on an Agilent 1100 nano-HPLC system
(75-µm C18 column, 100-min gradients), coupled to an LTQ-FT mass spectrometer
(Thermo Scientific). The 'Top-3-SIM' acquisition method was used, as
described. Spectra were processed by MSQuant and searched using Mascot
against a decoy Zea/Ustilago protein database. Mass tolerance for the precursor
ion was in all cases 5 p.p.m., and for fragment ions 0.5 Da; full trypsin specificity
was required and two missed cleavages were allowed. The mean measurement
mass deviation of precursor (peptide) ions was 0.96 p.p.m. with a standard deviation
of 0.82.
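The precursor tolerance quoted above is a relative error in parts per million. A minimal sketch of that check follows; the function names and the example m/z values are hypothetical, and the 5 p.p.m. default simply mirrors the tolerance stated in the text.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation of an observed m/z from its theoretical value, in p.p.m."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the absolute deviation does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical precursor at m/z 1000: a 0.004 shift is about 4 p.p.m.
accepted = within_tolerance(1000.004, 1000.0)
rejected = within_tolerance(1000.006, 1000.0)
```

Because the tolerance is relative, the permitted absolute deviation grows with m/z, unlike the fixed 0.5 Da window used for fragment ions.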
DNA stretching by bacterial initiators
promotes replication origin opening


Karl E. Duderstadt, Kevin Chuang & James M. Berger


Many replication initiators form higher-order oligomers that process host replication origins to promote replisome 
formation. In addition to dedicated duplex-DNA-binding domains, cellular initiators possess AAA+ (ATPases associated 
with various cellular activities) elements that drive functions ranging from protein assembly to origin recognition. In 
bacteria, the AAA+ domain of the initiator DnaA has been proposed to assist in single-stranded DNA formation during 
origin melting. Here we show crystallographically and in solution that the ATP-dependent assembly of Aquifex aeolicus 
DnaA into a spiral oligomer creates a continuous surface that allows successive AAA+ domains to bind and extend 
single-stranded DNA segments. The mechanism of binding is unexpectedly similar to that of RecA, a homologous 
recombination factor, but it differs in that DnaA promotes a nucleic acid conformation that prevents pairing of a 
complementary strand. These findings, combined with strand-displacement assays, indicate that DnaA opens 
replication origins by a direct ATP-dependent stretching mechanism. Comparative studies reveal notable commonalities 
between the approach used by DnaA to engage DNA substrates and other, nucleic-acid-dependent, AAA+ systems. 


All organisms depend on ring- and spiral-shaped ATPase assemblies
to carry out essential processes ranging from proteolysis and membrane
trafficking to signalling events and nucleic acid transactions.
DNA replication onset in cells reflects one such process, using ATP-
dependent initiation factors to coordinate replisome assembly.
Replication initiators of eukaryotes and prokaryotes contain
AAA+-family ATPase domains, the activity of which is augmented
by duplex-DNA-binding domains and specialized protein-protein
interaction elements that assist with origin recognition and recruit
specific replication factors. Although all AAA+ enzymes share a
common structural core with RecA-type ATPases, together forming
the additional strand catalytic glutamate (ASCE) supergroup of
P-loop NTPases, the molecular logic that allows a common
nucleotidyl-hydrolase module to control the disparate activities of
replication initiators, and ASCE proteins in general, is not understood.

In bacteria, replication initiation relies on the DnaA protein. In
Escherichia coli, multiple DnaA molecules bind to the replication
origin, oriC, through several duplex DNA-binding sites, forming a
large nucleoprotein complex in the presence of ATP. With the aid
of appropriate architectural proteins (such as integration host factor)
and negatively supercoiled DNA, this complex subsequently melts an
(A+T)-rich, DNA-unwinding element (DUE) located adjacent to the
duplex DnaA binding sites. ATP also activates a secondary DNA-
binding site within DnaA, thought to reside within the AAA+
domain, which engages single-stranded regions of the DUE to form
a stable open complex. DnaA then collaborates with the bacterial
helicase loader (DnaC in E. coli) to recruit two hexamers of the DnaB
helicase to the origin and promote replisome assembly.

Although most AAA+ enzymes form closed-ring assemblies (refs 20, 21),
structural studies have indicated that initiators and polymerase
clamp-loaders form open-ring structures. Among initiator/loader
systems, DnaA is particularly unusual in that it has been seen to
oligomerize into a right-handed, spiral filament. Two models have been
proposed to explain how this structure might aid origin melting
(Supplementary Fig. 1). In one, the wrapping of duplex DNA about a
DnaA superhelix would constrain a positive supercoil, generating
compensatory negative writhe that could aid opening of the neighbouring
DUE. In the other, the wrapped DnaA-DNA complex would serve as a
nucleation centre, allowing DnaA protomers to engage directly and melt
the DUE, possibly through the initiator's ATPase elements. Thus far,
experimental evidence has supported both models, leaving open
the question as to how DnaA catalyses origin melting. The relationship
of this mechanism to other initiation systems, or to AAA+/ASCE
proteins overall, is also unclear.


A DnaA-ssDNA crystal structure 


To examine these issues, we set out to determine the structure of DnaA
bound to single-stranded DNA (ssDNA). Using a truncation of Aquifex
aeolicus DnaA consisting of the AAA+ and duplex-DNA-binding
domains (which, like its E. coli counterpart, is active for both
ATP-stimulated assembly and ssDNA binding), we first grew DNA-
free crystals in the presence of Mg²⁺ and the non-hydrolysable ATP
mimic AMPPCP. DNA substrates were then soaked into these crystals
under low-salt conditions (Methods). Data collection and phasing by
molecular replacement revealed four DnaA protomers per asymmetric
unit, arranged in a spiral configuration that propagates into a continuous
protein helix by the action of crystal-symmetry elements (Fig. 1a, b),
along with bound ssDNA. Of the multiple substrates screened
(Methods), dA₁₂ yielded the highest-quality density (Supplementary
Fig. 2a, b), and served as the best target for model building and refinement.
The final structure, containing a DnaA:AMPPCP:Mg²⁺:dA₁₂
stoichiometry of 4:4:4:1, was refined to an Rwork/Rfree of 24.9/26.8% at
3.35 Å resolution (Supplementary Table 1).


DnaA-ssDNA interactions 

The overall arrangement of DnaA subunits in the helical assembly is
highly similar to a DNA-free form reported previously (0.7 Å root mean
squared deviation between all Cα positions). AMPPCP•Mg²⁺ binds
at the interface between neighbouring subunits, with the γ-phosphate of
AMPPCP coordinated by catalytic amino acids from pairs of adjoining
AAA+ domains. Single-stranded DNA associates exclusively with the
AAA+ elements of the initiator, with each protomer binding three
nucleotides of the dA₁₂ strand (Fig. 1a). Almost all contacts are made
through the phosphodiester backbone, exposing the DNA bases to
solvent. Each trinucleotide segment adopts a B-form DNA conformation
(Supplementary Fig. 3), with the bases between consecutive segments
separated by large (~10 Å) gaps that extend the substrate by
~50% (Supplementary Table 2 and Supplementary Information).

DnaA binds ssDNA using just two pairs of helices, α3/α4 and α5/α6,
both of which line the central channel of the protein assembly
(Fig. 1c). The geometry of these two elements creates a single conduit
along the length of the DnaA superhelix that allows substrate to
traverse consecutive DnaA protomers. Interestingly, helices α3/α4
also comprise the initiator-specific motif, which both promotes filament
formation and distinguishes DnaA as a member of the initiator
clade of the AAA+ superfamily.

DnaA uses a simple network of interactions to coordinate ssDNA. 
The initiator-specific motif forms a shelf for each trinucleotide, in 
which a conserved hydrophobic residue, Val 156, forms van der 
Waals contacts with the sugar and base of the first nucleotide in the 
triplet (Fig. 1d). The central phosphate of each trinucleotide is bound 
by the electropositive, amino-terminal helix dipole of α6 and hydrogen
bonded by Thr 191 (Fig. 1c, d). These contacts are flanked by two
positively charged residues, Arg 190 and Lys 188, which make salt- 
bridge interactions with the phosphates of nucleotides 1 and 3, 
respectively. Notably, mutant initiators containing substitutions in 
these observed DNA-binding residues show reduced affinity for 
ssDNA in solution (Supplementary Fig. 4), confirming that the crys- 
tals captured a physiologically meaningful initiator state. Moreover, 
mutations of the same positions in E. coli DnaA (amino acids Arg 245, 
Lys 243 and Val 211) also disrupt ssDNA binding and origin melting.
Thus, the ssDNA engagement strategy seen here seems to be
conserved across bacterial species. 


Figure 1 | The ATPase pore of assembled DnaA
binds ssDNA. a, Side view of the asymmetric unit,
with DnaA subunits differentially coloured. Single-
stranded DNA is displayed as red sticks. AMPPCP
and Mg²⁺, bound to chain A, are shown as spheres
coloured by element and in magenta, respectively;
AMPPCP•Mg²⁺ bound to chains B-D is occluded
in this view. b, Side and top views of oligomerized
DnaA, reconstructed through crystal packing,
showing 12 DnaA subunits and 3 strands of
ssDNA. Colouring as in panel a. c, Side view of the
DnaA tetramer with helices α3/α4 and α5/α6
highlighted in orange and yellow, respectively.
ISM, initiator-specific motif. Single-stranded DNA
is shown as a transparent stick-and-surface
representation coloured by element; phosphates
are further highlighted as red spheres. d, Protein-
DNA contacts. Protein chains B (left) and C (right)
are displayed with the same colouring as in panel
c. Single-stranded DNA is coloured by element.


Structural similarities between DnaA and RecA


In considering the assembly patterns of oligomeric ATPases, we were
struck by the similarity of DnaA to one system in particular: the
homologous recombination protein, RecA. Although the cellular
functions of these two proteins are fundamentally different (catalysis 
of DNA strand-exchange reactions versus replication origin melting 
and coordination of replisome assembly), both RecA and DnaA are
predicated upon an ASCE ATPase fold. Like DnaA, RecA (and its
Rad51/RadA orthologues) forms a helical assembly that engages DNA
with its pore regions. These shared physical properties led us to
undertake a more detailed comparison of RecA and DnaA. Of the
multiple models available, the structure of a RecA oligomer bound to
ssDNA, representing the presynaptic complex formed during the
initial stages of homologous recombination, is globally most similar to 
the DnaA state we observe (Fig. 2a, b). As with DnaA, RecA contacts 
DNA almost exclusively through the phosphodiester backbone, 
which sits in the interior of a positively charged filament pore. Each 
RecA protomer binds three nucleotides in a B-DNA conformation, 
with the base stacking between each triplet interrupted such that 
ssDNA is extended ~1.5-fold compared to a B-form duplex (Fig. 2c). 

RecA and DnaA also exhibit some interesting and significant dif- 
ferences. A visual examination of each triplet shows that RecA uses a 
more extensive network of contacts for engaging ssDNA than does 
DnaA (Fig. 2d, e), burying twice as much surface area per triplet
(318 Å² and 639 Å² for DnaA and RecA, respectively). This difference
derives largely from an additional β-hairpin in RecA that fills the gap
between each triplet and reinforces each three-base stack. Moreover,
whereas two of the three nucleotides within each RecA triplet (posi- 
tions 1 and 2) align well with those seen in DnaA, position 3 of the 
DnaA trinucleotide rotates away from the pore axis by ~50° (Fig. 2f). 
This difference skews consecutive DnaA triplets away from one 
another, disrupting the formation of a smoothly spiral arrangement 
as seen in RecA (Fig. 2c). 


Figure 2 | DnaA engages ssDNA in a manner similar to RecA. a, View of a
DnaA-AMPPCP-ssDNA pentamer (consisting of one full tetramer, as well as
chain A (DnaA,_) and its associated triplet from the adjacent asymmetric unit).
AMPPCP•Mg²⁺ is shown as spheres coloured by atom; ssDNA as red sticks.
b, View of a RecA-ADP-AlF₄-ssDNA pentamer (Protein Data Bank (PDB)
accession 3CMW). ADP•AlF₄•Mg²⁺ is shown as spheres coloured by atom;
ssDNA as red sticks. c, Comparison of ssDNA bound to DnaA (orange), RecA
(green) and a strand of B-DNA (yellow). d, Close-up view of triplet bound to
DnaA (chain C) with magenta dashed lines indicating key contacts. e, Close-up
view of triplet bound to RecA (protomer 2) with magenta dashed lines
indicating key contacts. f, Side (left) and top (right) views of the triplets
displayed in d and e aligned with each other.

Figure 3 | DnaA extends ssDNA in solution. a, Cartoon of ssDNA extension
assay. b, Emission scan (donor excitation) of FR-dT₂₁ in the presence of 10 µM
DnaA with either ADP•BeF₃ (top) or ADP (bottom). c, Emission scan (donor
excitation) of FR-dT₂₁ in the presence of 10 µM RecA with either ATPγS (top)
or ADP (bottom). Reported transfer efficiencies and distances were calculated
using donor emission as described in Methods. a.u., arbitrary units.


DNA extension is ATP- and assembly-dependent


The ability of RecA to stretch DNA to the extent observed crystallographically
has been amply substantiated by various methodologies.
Using these efforts as a guide, we set out to determine whether the DNA
conformation that we observe bound to DnaA accurately represents
the state of the substrate in solution. To accomplish this, we turned to a
bulk-phase fluorescence resonance energy transfer (FRET)-based
ssDNA extension assay analogous to single-molecule approaches
applied to RecA. Using a poly-thymine DNA labelled with Cy3 and
Cy5 (FR-dT₂₁) (Supplementary Table 3), we monitored changes in the
length of ssDNA resulting from DnaA binding (Fig. 3a). Analogous
studies were performed with RecA as a control. As both RecA and
DnaA require ATP for formation of the oligomers observed in the
structural models, we expected ATP-dependent extension to lead to
a loss of FRET signal. We tested for extension both in the presence of 
the ATP analogues ATPyS and ADP¢BeF;, to avoid complications that 
might arise from nucleotide hydrolysis, and in the presence of ADP, 
which is known to promote DnaA disassembly. Pronounced extension 
was observed only in the presence of the ATP analogues (Fig. 3b, c), and 
not with ADP. The lengths of ssDNA in the ATP-assembled states of 
both proteins, as calculated from the FRET data, were in close agree- 
ment with those observed in the crystal structures (Supplementary 
Table 6). Likewise, mutations in ssDNA-binding amino acids and 
residues required for DnaA assembly all significantly reduced 
ssDNA extension (Supplementary Fig 6), demonstrating that this 
activity depends on substrate binding to the pore of an initiator 
oligomer that forms only when activated by ATP. 


DnaA directly catalyses duplex melting 

How replication origins are opened for replisome assembly is an import- 
ant, unanswered question. Given the similarities between the ssDNA 
binding and extension activities of DnaA and RecA, we reasoned that 
the initiator might directly destabilize and disrupt DNA duplexes. This 
activity is a known property of RecA, albeit one that permits the
recombinase to exchange DNA strands between target substrates
actively.

To test this idea, we developed a DNA strand-displacement assay for 
DnaA. First, the initiator was incubated with a short duplex containing 
one fluorescently labelled strand. Unlabelled competitor strand was 
then added to capture any unwound species (Fig. 4a). Both ADP and 
ADP•BeF3 were tested to determine whether initiator assembly affected
the outcome of the experiment, as were DNAs of different lengths and
stabilities. Analysis of the resultant products by gel electrophoresis
shows that DnaA readily unwinds a 15mer duplex DNA of moderate
stability (Tm = 43 °C) in the presence of the ATP mimic (Fig. 4b). By
contrast, increasing the stability of the DNA substrate by ~30% (using a
20mer, Tm = 55 °C) weakens the unwinding activity of DnaA (Fig. 4b),
while increasing DNA stability even further (30mer, Tm = 62 °C)
abrogates melting completely (Supplementary Fig. 7a). Importantly,
ADP did not support strand displacement, nor did ssDNA binding 
and DnaA assembly mutants (Supplementary Fig. 7b, c). These controls
indicate not only that double-stranded-DNA melting depends on
formation of an assembled DnaA oligomer, but also that the initiator is
fine-tuned to specifically disrupt DNAs of modest stability.

One significant functional difference between RecA and DnaA is
that the recombination protein can drive a true strand-exchange
reaction; that is, in addition to displacing one strand of a duplex,
RecA can also pair homologous ssDNA segments into a
double-stranded molecule. By contrast, the function of DnaA is to separate
double-stranded origin regions. Inspection of the RecA and DnaA
complexes reveals a physical basis for these differing properties: in
DnaA, successive trinucleotide elements are arranged in a state
incompatible with the formation of a continuous duplex, whereas
ssDNA bound to RecA adopts a smoothly spiralled arrangement
permitting the contiguous pairing of a complementary strand (Fig. 4c).
This distinction arises primarily from the 50° rotation between the
nucleotides at the third position of each triplet seen in the RecA and
DnaA models (Fig. 2f). In DnaA, the orientation of this nucleotide
appears to be stabilized by base stacking, whereas in RecA the
β-hairpin insertion helps to sculpt the configuration of the DNA to
create a contiguous base-pairing surface.

Figure 4 | DnaA directly melts duplex DNA. a, Schematic of strand
displacement assay. The green circle represents the Cy3 fluorescent end-label
used to follow the status of one DNA strand. Complementary strands of duplex
substrates are coloured grey and black. b, Strand displacement assay conducted
with 15mer and 20mer duplex substrates (C3-15mer and C3-20mer) in the
presence and absence of different nucleotides. DnaA concentrations used are
indicated above each lane. c, Left: cartoon model showing how complementary
base triplets (yellow) would pair (in a B-DNA manner) with ssDNA bound to
DnaA (red). The orientation of successive DnaA-bound triplets is such that it
prevents the formation of a continuous base-paired strand, favouring duplex
separation. Right: same DNA view, but as seen in RecA, where triplets are
oriented to allow pairing of an extended complementary strand to promote
duplex formation and strand exchange (PDB accession 3CMX).


Implications for origin melting

Together, our findings present the strongest evidence yet that DnaA
melts replication origins by directly assisting with the separation and
sequestration of duplex DNA strands (Supplementary Fig. 1c).
Notably, this activity does not contradict the demonstrated need for
other factors capable of reshaping and/or destabilizing DNA (for
example, integration host factor and negative supercoiling) during
initiation. Rather, these elements probably help to promote DnaA
assembly and prime the origin for melting by what otherwise would
be an inefficient unwindase. In this view, the AAA+ domains of DnaA
may first engage only one of the two strands of duplex DNA with their
ssDNA binding elements (possibly at reported ssDNA or ATP-DnaA
binding sites). In the presence of ATP, which triggers initiator
assembly, subunit-subunit interactions would help to restructure the
DNA backbone, stretching the contacted strand to facilitate melting.
Re-annealing would be disfavoured by the non-contiguous
arrangement of base triplets in the extended state (Fig. 4c). Future studies will
be needed to define the specific order and effect of these events further.

We envision that the propensity of DnaA to open DNA could be
adjusted in other bacterial species by strengthening or weakening the
association of its ATPase domains with DNA and/or each other. An
attractive feature of such a mechanism is that it is amenable to
additional layers of control by changes to DUE sequence, superhelical
density and co-resident architectural factors to ensure that a replication
origin fires only when DnaA is both present and assembled properly.
Such flexibility may have had a role in allowing DnaA to persist as the
primary initiator in bacteria that have adapted to markedly different
environmental niches.


Thematic patterns of substrate recognition in AAA+ ATPases

The mechanism by which DnaA coordinates ssDNA also comports
well with findings in other replication initiation systems and with
ASCE ATPases in general. For instance, many oligomeric RecA and
AAA+ enzymes bind substrate in the interior pore of a closed- or
cracked-ring particle. DnaA follows this pattern. A comparison
of DnaA to other, disparate nucleic-acid-dependent AAA+ systems
(for example, polymerase clamp-loaders and processive helicases)
further shows that these factors also associate with client substrates
in a remarkably analogous manner, using the same face of the core αβα
ATP-binding fold to engage a short backbone stretch of their target
DNAs (Fig. 5). For AAA+ proteins involved in initiation, these similar
contact mechanisms have been differentially co-opted to assist with
specific protein functions, ranging from the control of origin
recognition (as seen in archaeal Orc1 proteins) to mediating processive
DNA unwinding (viral superfamily 3 helicases). DnaA, with its
ability to melt (but not translocate along) DNA, seems to use an
intriguing mix of some of the activities exhibited by related initiation
systems. Future efforts will be needed to determine how subtle differences
in the position and nature of substrate-binding surfaces, combined
with specific alterations in the assembly patterns of central AAA+
domains, endow such molecular motors and switches with their
distinct biochemical properties.

Figure 5 | Common DNA recognition strategies of AAA+ proteins.
Structures of DNA-bound assemblies (top) and individual domains (bottom)
for AAA+ proteins involved in replication. All recognize DNA using the same
face of the AAA+ fold (violet; bottom). a, Bacterial clamp-loader (γδδ′)
complex (AAA+ domains, differentially coloured) bound to primer-template
DNA (PDB 3GLF). b, Archaeal initiators Orc1-1 (grey) and Orc1-3 (AAA+
domain, green) bound to origin DNA (PDB 2QBY). c, Bacterial initiator
DnaA (AAA+ domains, grey/blue) bound to ssDNA. d, Viral initiator/helicase
E1 (AAA+ domains, orange/grey) bound to ssDNA (PDB 2GXA). For all
panels, DNA is shown as either red spheres (top) or as a red/grey cartoon
(bottom). Nucleotide co-factors bound to AAA+ domains (bottom) are
represented as spheres coloured by atom.


METHODS SUMMARY 


Detailed information regarding experimental methods, substrate sequences, bind- 
ing constants, and FRET efficiencies and distances can be found in the Methods 
and in Supplementary Information. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS


Expression and purification of DnaA. Residues 76-399 of A. aeolicus DnaA
(containing the AAA+ and duplex-DNA-binding regions) were expressed as a
TEV-protease cleavable His6-MBP fusion and purified as previously described.
As a final purification step, untagged DnaA proteins (from TEV cleavage) were
run over an S-200 size-exclusion column (GE) in gel-filtration buffer (50 mM
HEPES pH 7.5, 500 mM KCl, 10% (v/v) glycerol, 5 mM MgCl2, 100 µM ADP).
Monomeric species were pooled, concentrated and flash-frozen for storage at
−80 °C. For mutagenesis studies, changes were introduced into the
His6-MBP-DnaA construct using QuikChange (Stratagene).

Crystallization and DNA soaking. After gel filtration of DnaA in crystallization
buffer (20 mM HEPES pH 7.5, 250 mM KCl, 250 mM KBr, 10% (v/v) glycerol,
10 mM MgCl2, 100 µM AMPPCP), monomeric species were pooled, concentrated
to 10 mg ml⁻¹ at 4 °C, and flash-frozen for storage at −80 °C. Crystallization by
hanging-drop vapour diffusion was performed by mixing 1.3 µl of freshly thawed
DnaA in crystallization buffer and 1 µl of well solution (15-35 mM sodium
cacodylate pH 6.5, 26% 1,2-propanediol and 1-2% PEG 2000 MME) at 18 °C.
Large rod-like crystals appeared within 1 to 2 weeks and reached maximal size
around 3 weeks. Crystals were transferred by looping to a low-salt soaking solution
(20 mM HEPES pH 7.5, 30 mM sodium cacodylate pH 6.5, 10% (v/v) glycerol,
10 mM MgCl2, 26% 1,2-propanediol, 2.5% PEG 2000 MME and 200 µM
AMPPCP) containing 5 mM ssDNA. After 6 h, crystals were looped and
transferred to a second drop of soaking solution containing 5 mM ssDNA, and left
overnight both to ensure complete removal of remaining salt and to allow time
for binding. The crystals were then looped and flash-frozen in liquid nitrogen in
preparation for data collection. Previous biochemical studies revealed no apparent
sequence preference for ssDNA by A. aeolicus DnaA, so DNAs of various
sequences and lengths were all individually tested (dTn (n = 3-12) and dAn
(n = 3-12), Elim Biopharmaceuticals). Data collection and structure
determination revealed that dA12 generated the strongest electron density, although
similar, albeit weaker and less connected, density was observed for dT
oligonucleotides and smaller dA substrates.

Data collection and structure determination. Data were collected at Beamline
8.3.1 at the Advanced Light Source (ALS) and processed using HKL-2000.
Crystals belong to the space group P2₁2₁2₁, with dA12-soaked crystals having unit
cell dimensions a = 99.8 Å, b = 114.2 Å and c = 201.3 Å (Supplementary Table 1).
Data were phased using molecular replacement as implemented in PHENIX,
using a DNA-free DnaA tetramer as a search model (PDB 2HCB). Initial Fo − Fc
electron density maps containing clear density for DNA were generated using
rigid-body and grouped B-factor refinement with PHENIX. Further refinement
was conducted using multidomain, non-crystallographic-symmetry
(NCS)-restrained, simulated annealing in PHENIX, fourfold multidomain NCS
averaging with a custom solvent mask (including the region of DNA binding),
density modification using RESOLVE, and manual model building in COOT.
During the final stages of refinement, fourfold multidomain NCS and secondary
structure restraints were retained for AMPPCP and the entire protein except
residues 255-265, which differed between chains as a result of crystal packing
interactions. Composite, simulated-annealing omit maps generated with CNS
were used as a guide for building with COOT. DNA and waters were manually
added to the model, and final rounds of refinement with PHENIX were conducted
with grouped B-factor modelling, as well as NCS restraints and TLS modelling
of individual protein domains (comprising three TLS groups in total: the
AAA+ core (amino acids 76-241) plus AMPPCP; the AAA+ α-helical ‘lid’
(amino acids 242-254 and 266-290); and the duplex-DNA binding domain
(amino acids 291-399)). All panels of figures with renderings of structures and
electron density were prepared with PyMol.

The final model contains one DnaA tetramer bound to one dA12 per
asymmetric unit. A clear 5′ or 3′ break between successive dA12 substrates was not
present in the electron density, indicating that during the soaking procedure,
different DNA molecules bound in multiple registers to consecutive DnaA
protomers throughout the crystal. Accordingly, a terminal 5′ phosphate, which was
not present in the substrate used for soaking, was added to the modelled dA12
DNA.

Polarity of DNA binding. During refinement, DNA was initially modelled inde- 
pendently into the DnaA pore with each of the two possible polarities. Compared 
to the 5′ to 3′ polarity presented in the paper, refinement of the model with the
dA12 substrate running 3′ to 5′ (from the arginine finger side to the
nucleotide-binding face of a DnaA protomer) resulted in only marginally higher
Rwork and Rfree values (~0.1%), but also the appearance of off-model positive
difference density and on-model negative difference density in Fo − Fc maps (the
model as presented in the paper did not display such features). Simultaneous
refinement with two DNAs, each at half occupancy, with opposing polarities of the
dA12 substrate resulted in ~0.3% higher Rwork and Rfree values, and again showed
unfavourable difference density in Fo − Fc maps.

Recognizing that these differences, although consistent with our build, were 
subtle and did not definitively resolve the ssDNA binding polarity to DnaA, we set 
out to test our assignment further. To this end, we designed and had synthesized 
(by Trilink BioTechnologies) two specialized, di-adenosyl nucleotide substrates 
that would give rise to a clear distinction in binding orientation: 5′-p(Br-A)pAp
and 5′-p(ε-A)pAp, where ‘Br-A’ indicates a bromo-deoxyadenosine label, ‘ε-A’
indicates an etheno-deoxyadenosine label, and ‘p’ indicates a phosphate moiety.
Soaking of crystals with these dinucleotide substrates was performed as described
for ssDNA substrates (note that in our soaking trials with oligonucleotides as
short as dA3, we observed density associated with DnaA protomers consistent
with that seen for the trinucleotide repeats when using dA12). Unfortunately, data
collected with the ε-A-substituted oligonucleotide yielded maps with density for
dinucleotides bound to each monomer, but additional density for the etheno-A
was not clearly visible, probably due to the low ~3.4 Å resolution limit of the
DnaA crystals. At the same time, SAD data sets collected with the
Br-A-substituted oligonucleotide did not yield useful maps, due to the weak diffraction
of the crystals (and accompanying radiation damage as we attempted to maximize
data signal-to-noise at the bromine absorption maximum), to incomplete
bromine labelling, or both. We note that we carried out soaks with longer 
Br-dA-labelled (and Br-dU-labelled) oligonucleotides, but these efforts were 
not successful, again because of weak diffraction. Additional experiments to test 
the orientation (for example, using labelled oligo/protein pairs and FRET) were 
considered, but ruled out due to the small binding site size for substrate DNA, and 
an inability to find a suitable pair of labelling sites that could report on differing 
binding orientations. 

As a consequence, although our data are supportive of the polarity presented in 
our model, we cannot definitively rule out the possibility that ssDNA might also be 
binding to DnaA in the crystal in an opposing direction. Nonetheless, several 
findings support the idea that DnaA binds ssDNA in a defined orientation that is 
consistent with the direction suggested here. For example, following nucleoprotein 
complex formation on oriC, DnaA melts (A+T)-rich regions in the DUE; two
independent reports have found that E. coli DnaA binds specifically to only one
strand (the so-called ‘top’ strand) of the DUE during this process. The
importance of DnaA binding polarity becomes clear during the next stage of initiation,
when the DnaB helicase is loaded. Modelling studies based on the known DnaB
translocation polarity (5′ to 3′) and known pairwise interactions between DnaB,
DnaC and DnaA have suggested that top-strand loading involves a direct
interaction between DnaA and DnaC that has been observed biochemically and depends
on the AAA+ domains of the two proteins. Because AAA+ domains assemble
with a defined orientation, in which the arginine finger face of one protomer points 
into the nucleotide binding face of a second subunit, it follows that DnaA molecules 
probably position themselves on the top strand with only one of their two AAA+ 
domain surfaces presented to DnaC. Although the polarity of the DnaA-DnaC 
interaction has not been established, a mutation on the arginine finger face of the 
E. coli DnaA AAA+ domain, R281A, is reported to disrupt helicase loading, but 
not oriC melting; this finding indicates that DnaA interacts with DnaC using its
arginine-finger face. In our structures, the 5′ end of the modelled DNA resides
near the arginine finger face of DnaA, a configuration consistent with these data.

ssDNA extension assay. Extension of dT21 oligonucleotides labelled with Cy3
and Cy5 (FR-dT21) by DnaA was monitored by FRET using a FluoroMax-4
(Horiba Jobin Yvon) spectrofluorimeter. Measurements were carried out at
25 °C in 20 µl with 25 nM of FR-dT21 and either 10 µM of DnaA in DnaA
extension buffer (50 mM HEPES pH 7.5, 125 mM KCl, 2% (v/v) glycerol, 10 mM
MgCl2 and 2 mM ADP or ADP•BeF3) or 10 µM of RecA in RecA extension buffer
(25 mM Tris-acetate pH 7.5, 100 mM Na-acetate, 10 mM Mg-acetate, 1 mM DTT
and 2 mM ADP or ATPγS). Emission scans from 545 to 700 nm were collected
with excitation of Cy3 at 530 nm, divided by the excitation intensity, and then
corrected for the wavelength-dependent sensitivity of the detector. FRET
efficiencies and distances were determined by comparing the Cy3 fluorescence from
the doubly labelled substrate (FR-dT21) with the Cy3 fluorescence from a
substrate only having a Cy3 label (C3-dT21) under the same conditions.

Influence of proteins on dye behaviour. To ensure that all influences on dye
behaviour were properly considered when processing the FRET data from the
DNA extension assay (Fig. 3), the fluorescence and absorbance of each dye were
monitored independently for each experimental condition. Emission and
absorbance scans of substrates labelled only with the Cy3 donor (C3-dT21) were
collected in buffer alone, and with protein in the presence of ATP mimics or ADP
(Supplementary Fig. 8c (panels ii and iii) and d (panels ii and iii)). Emission scans
revealed pronounced protein- and nucleotide-dependent enhancement of donor
fluorescence, but negligible differences in donor absorbance. Similar, but less
significant, effects have been observed previously for RecA at a concentration
of 1 µM (comparable to our working concentration of 10 µM). A similar
enhancement in acceptor fluorescence (and lack of effect on acceptor absorbance)
was also observed in the doubly labelled substrate, FR-dT21 (Supplementary Fig. 8c
(panels iv and v) and d (panels iv and v)). These effects are not surprising, as the
spectral properties of fluorescent dyes are known to undergo marked variation
depending on chemical environment. However, these controls also indicated
that we needed to take into account additional corrections to obtain accurate
distance measurements. In particular, the changes in donor fluorescence, but
not donor absorbance, were indicative of changes in the donor quantum yield
($\Phi_D$), which is used to calculate $R_0$ (Å), the distance corresponding to a FRET
efficiency of 50%: $R_0 = 8.79 \times 10^{-5}\,(J\kappa^{2}n^{-4}\Phi_D)^{1/6}$, where $J$ is the spectral overlap
between the donor emission and acceptor absorption; $\kappa^{2}$ is a geometric factor that
depends on the orientation of donor and acceptor; and $n$ is the refractive index of
the medium between donor and acceptor.

To determine the donor quantum yield under different experimental condi- 
tions, we used rhodamine 6G as a standard for calibration, with an assumed 
quantum yield of 0.95 in EtOH. To calculate the quantum yields seen in
Supplementary Table 5, we collected the fluorescence and absorbance of the
donor-only labelled substrate (C3-dT21) in the RecA and DnaA buffers. We then
used these data to determine ratios between the integrated fluorescence and
absorbance, while correcting for the fractional absorbance at the excitation
wavelengths used. Fluorescein in 0.1 M NaOH (known to have a quantum yield of
0.95 (ref. 56)) was also measured as a control. To ensure reliable readings, all
absorbance measurements were conducted with 1 µM of the donor-only labelled
substrate, either alone or in the presence of 10 µM of the indicated protein. All
emission measurements were conducted with 25 nM of the donor-only labelled
substrate, either alone or in the presence of 10 µM of the indicated protein. Because
the presence of protein had no influence on dye absorbance (Supplementary Fig. 8c
(panels iii and v)), the quantum yield of the donor in the presence of different
proteins was determined simply by using its value in buffer, and multiplying by the
observed changes in fluorescence. $R_0$ values were then calculated for each sample
using the corresponding values for quantum yield.
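
The quantum-yield correction described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and all numerical inputs are hypothetical and are not values from Supplementary Table 5. The sketch scales a buffer-only quantum yield by the observed change in donor fluorescence (reasonable only because donor absorbance was unchanged) and relies on the fact that the Förster distance varies as the sixth root of the donor quantum yield when the spectral overlap, orientation factor and refractive index are held fixed.

```python
def scaled_quantum_yield(phi_buffer, fluor_buffer, fluor_protein):
    """Donor quantum yield with protein present, assuming unchanged absorbance."""
    if fluor_buffer <= 0:
        raise ValueError("buffer fluorescence must be positive")
    return phi_buffer * (fluor_protein / fluor_buffer)


def rescaled_r0(r0_reference, phi_reference, phi_new):
    """Rescale the Forster distance; R0 is proportional to phi ** (1/6)."""
    if phi_reference <= 0 or phi_new <= 0:
        raise ValueError("quantum yields must be positive")
    return r0_reference * (phi_new / phi_reference) ** (1.0 / 6.0)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
phi_protein = scaled_quantum_yield(phi_buffer=0.20, fluor_buffer=100.0,
                                   fluor_protein=150.0)
r0_protein = rescaled_r0(r0_reference=55.0, phi_reference=0.20,
                         phi_new=phi_protein)
```

Because of the sixth-root dependence, even a 50% enhancement in donor fluorescence changes the Förster distance by only a few per cent, which is why the correction matters mainly for accurate absolute distances.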

Determination of FRET efficiencies and DNA length. To determine the
efficiency of transfer ($E$) from the FRET data collected using the DNA extension
assay (Fig. 3 and Supplementary Figs 6, 8 and 9), the emission of the donor from
the donor-only labelled substrate (C3-dT21, $F_D$) was compared to the emission of
the donor from the doubly labelled substrate FR-dT21 ($F_{DA}$) under equivalent
experimental conditions as follows: $E = 1 - F_{DA}/F_D$ (ref. 57). The efficiencies for
different samples can be found in Supplementary Table 5. Donor-acceptor distances
($R$) were subsequently obtained using the relation $R = R_0\,[(1-E)/E]^{1/6}$ (ref. 54), the
values of which can also be found in Supplementary Table 5.
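
As a worked illustration of the two relations in this section, the following Python sketch computes a transfer efficiency from donor emission measured with and without the acceptor, and then converts it to a distance. The function names and the example readings are hypothetical; the Förster distance passed in is an arbitrary placeholder rather than a value from Supplementary Table 5.

```python
def fret_efficiency(f_da, f_d):
    """E = 1 - F_DA / F_D, from donor emission with (F_DA) and without (F_D) acceptor."""
    if f_d <= 0:
        raise ValueError("donor-only emission must be positive")
    return 1.0 - f_da / f_d


def fret_distance(r0, efficiency):
    """R = R0 * ((1 - E) / E) ** (1/6); defined only for 0 < E < 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * ((1.0 - efficiency) / efficiency) ** (1.0 / 6.0)


# Hypothetical readings: donor emission halves when the acceptor is present.
e_example = fret_efficiency(f_da=50.0, f_d=100.0)
r_example = fret_distance(r0=55.0, efficiency=e_example)
```

At an efficiency of 50% the distance equals the Förster distance by definition, and a lower efficiency (less quenching of the donor) corresponds to a longer distance, which is the signature of ssDNA extension in this assay.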

DNA strand displacement assay. The DnaA-dependent displacement of single 
strands from duplex DNA was monitored using a Cy3 label on one of two strands 
(the ‘bottom’ strand, Supplementary Table 3). All measurements were carried out 
at 25 °C in 80 µl of binding buffer containing 50 mM HEPES pH 7.5, 125 mM
KCl, 2% (v/v) glycerol, 10 mM MgCl2, 0.1 mg ml⁻¹ bovine serum albumin, 1 mM
DTT and 2 mM ADP or ADP•BeF3 (a non-hydrolysable ATP analogue that
mimics the properties of ATP). After a short 2-min incubation of 25 nM
duplex DNA with various DnaA concentrations (Fig. 4), 50 nM of unlabelled
bottom strand was added for an additional 30 min to capture displaced top
strands. After quenching with 10× stop buffer containing 200 mM EDTA, 10 mg
ml⁻¹ proteinase K and 4% (w/v) SDS, displaced strands were separated on native
polyacrylamide gels in Tris/boric acid/EDTA (TBE) buffer and visualized using a 
Molecular Dynamics Typhoon. The time dependence of DNA strand displace- 
ment by DnaA can be found in Supplementary Fig. 11. Sequences of substrates 
used can be found in Supplementary Table 3. 

ssDNA binding assay. Binding of 5′ fluorescein-labelled dT25 oligonucleotides
(F-dT25, Supplementary Table 3) to DnaA was monitored by fluorescence polarization
using a Victor 3V (Perkin Elmer) multi-label plate reader (Supplementary Fig. 4a).
Measurements were carried out at 25 °C in 20 µl of binding buffer containing
50 mM HEPES pH 7.5, 125 mM KCl, 2% (v/v) glycerol, 10 mM MgCl2, 0.1 mg ml⁻¹
bovine serum albumin, 1 mM DTT and 2 mM ADP•BeF3. The concentration of
F-dT25 was held constant at 10 nM while the concentration of DnaA was varied. All
data points represent the average of three independent measurements, with error
bars representing the standard deviation between measurements. Binding curves
were fitted to the Hill equation to obtain Kd(app) values (Supplementary Table 4) as
described previously.
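
For orientation, the quantities named in this assay can be sketched in Python. The exact parametrization of the Hill equation used for fitting is given elsewhere ("as described previously"), so the standard form below is an assumption, as are the function names and the choice of the sample (n − 1) standard deviation for the replicate error bars.

```python
import statistics


def hill_fraction_bound(protein_conc, kd_app, hill_coeff):
    """Standard Hill form: [P]^n / (Kd_app^n + [P]^n)."""
    if protein_conc < 0 or kd_app <= 0 or hill_coeff <= 0:
        raise ValueError("concentration must be >= 0; Kd and n must be > 0")
    p_n = protein_conc ** hill_coeff
    return p_n / (kd_app ** hill_coeff + p_n)


def summarize_replicates(values):
    """Mean and sample standard deviation of replicate measurements."""
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical triplicate polarization readings for one DnaA concentration.
mean_signal, sd_signal = summarize_replicates([1.0, 2.0, 3.0])
```

In this form the apparent dissociation constant is the protein concentration at which half of the labelled oligonucleotide is bound, regardless of the Hill coefficient.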

Oligomerization characteristics of ssDNA-binding mutants. To confirm that 
ssDNA binding mutations did not affect the ATP-dependent oligomerization 
properties of DnaA, we used a previously established glutaraldehyde-crosslinking 
assay. Crosslinking was performed by incubating 50 µg ml⁻¹ of various
A. aeolicus DnaA proteins in 80 µl of a reaction buffer (50 mM HEPES pH 7.5,
10% (v/v) glycerol, 125 mM KCl, 5 mM MgCl2, 2 mM DTT) containing 2 mM
ADP•BeF3 at 25 °C for 5 min. Glutaraldehyde (Polysciences Inc.) was then added
to 1 mM final concentration using 8.8 µl of a 10 mM stock. Reactions were
incubated at 25 °C for an additional 1 min before quenching with 8 µl of 200 mM
glycine followed by the addition of 30 µl of gel loading buffer (100 mM Tris pH
6.8, 24% (v/v) glycerol, 8% (w/v) SDS, 200 mM DTT, 0.02% (w/v) bromophenol
blue). Crosslinked proteins were loaded in a volume of 15 µl and separated on
denaturing 4.5% polyacrylamide gels (80:1 acrylamide:bisacrylamide) in 0.1 M
sodium phosphate, 0.1% SDS buffer (pH 7.2), and visualized by silver staining
(Supplementary Fig. 4b). 
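
The glutaraldehyde addition can be checked with a simple dilution calculation. The Python sketch below uses a hypothetical helper name and assumes that volumes are additive; it uses only the volumes and stock concentration stated in the protocol.

```python
def final_concentration(stock_conc, stock_volume, initial_volume):
    """C_final = C_stock * V_stock / (V_initial + V_stock), volumes assumed additive."""
    total_volume = initial_volume + stock_volume
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume / total_volume


# 8.8 ul of 10 mM glutaraldehyde added to an 80 ul reaction (values from the text).
glutaraldehyde_mM = final_concentration(stock_conc=10.0, stock_volume=8.8,
                                        initial_volume=80.0)
```

The result is about 0.99 mM, consistent with the stated 1 mM final concentration.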
Brain-machine interfaces use neuronal activity recorded from the
brain to establish direct communication with external actuators, 
such as prosthetic arms. It is hoped that brain-machine interfaces 
can be used to restore the normal sensorimotor functions of the 
limbs, but so far they have lacked tactile sensation. Here we report 
the operation of a brain-machine-brain interface (BMBI) that both 
controls the exploratory reaching movements of an actuator and 
allows signalling of artificial tactile feedback through intracortical 
microstimulation (ICMS) of the primary somatosensory cortex. 
Monkeys performed an active exploration task in which an actuator 
(a computer cursor or a virtual-reality arm) was moved using a 
BMBI that derived motor commands from neuronal ensemble 
activity recorded in the primary motor cortex. ICMS feedback 
occurred whenever the actuator touched virtual objects. Temporal 
patterns of ICMS encoded the artificial tactile properties of each 
object. Neuronal recordings and ICMS epochs were temporally 
multiplexed to avoid interference. Two monkeys operated this 
BMBI to search for and distinguish one of three visually identical 
objects, using the virtual-reality arm to identify the unique artificial 
texture associated with each. These results suggest that clinical 
motor neuroprostheses might benefit from the addition of ICMS 
feedback to generate artificial somatic perceptions associated with 
mechanical, robotic or even virtual prostheses. 

Brain-machine interfaces (BMIs) have evolved from 1-d.f. systems
to many-d.f. robotic arms and muscle stimulators that perform complex
limb movements, such as reaching and grasping. However,
somatosensory feedback, which is essential for dexterous control,
remains underdeveloped in BMIs. With the exception of a few studies
combining BMIs with tactile stimuli applied to the body, existing
systems rely almost exclusively on visual feedback. Prosthetic
sensation has been studied in the context of sensory substitution and
targeted reinnervation; however, these approaches have limited
application range and channel capacity. To provide a proof-of-concept
method of equipping neuroprostheses with sensory capabilities, we
implemented a BMBI that extracts movement commands from the
motor areas of the brain while delivering ICMS feedback in
somatosensory areas to evoke discriminable percepts. This idea
received support from our pilot study, in which a monkey responded
to ICMS cues with the movements of a BMI-controlled cursor.
However, the ICMS cue did not provide feedback of object-actuator 
interactions in this previous demonstration. 

The BMBI developed here allowed active tactile exploration during
BMI control (Fig. 1a). Two monkeys (M and N) received multielectrode
implants in the primary motor cortex (M1) and the primary
somatosensory cortex (S1) (Fig. 1b). They explored virtual objects
using either a computer cursor or a virtual image (avatar) of an arm
(Supplementary Fig. 1a, b). In ‘hand control’, the monkeys moved a
joystick with their left hands to position the actuator. They searched
through a set of virtual objects, selected one with a particular artificial
texture conveyed by ICMS, and held the actuator over that object to
obtain reward (Fig. 1a and Supplementary Fig. 1c, d). During ‘brain
control’, the joystick was disconnected and the actuator was controlled
by the activity of right-hemisphere M1 neurons. The behavioural
tasks varied in the number of objects on the screen, the artificial tex- 
tures used and the actuator type (Fig. 2a), and were more difficult than 
previously reported BMI tasks because of the presence of multiple 
objects in the workspace, a prolonged object selection period and the 
necessity of interpreting ICMS feedback. 

Figure 1 | The brain-machine-brain interface. a, Movement intentions are 
decoded from M1; artificial tactile feedback is delivered to S1. b, Microwires 
were implanted in M1 and S1. c, Microwires used for ICMS in monkey M are 
accented in red. d, Actuator movements for a trial in which monkey M explores 
UAT but ultimately selects RAT. Grey bars indicate stimulation patterns; insets 
indicate the ICMS frequency. e, Rastergram of M1 neurons recorded during the 
trial shown in d. 

ICMS was delivered through two pairs of microwires to the hand 
representation area of S1 in monkey M (Fig. 1c) and through one pair 
of microwires to the leg representation in monkey N. Each artificial 
texture consisted of a high-frequency pulse train presented in packets 
at a lower, secondary, frequency (Fig. 1d and Supplementary Fig. 2a). 
The rewarded artificial texture (RAT) consisted of 200-Hz pulse trains 
delivered in 10-Hz packets. The comparison artificial textures comprised 
400-Hz pulse trains delivered in 5-Hz packets (unrewarded 
artificial texture (UAT)) or an absence of ICMS (null artificial texture 
(NAT)). 
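
To illustrate the packet structure described above, the following minimal Python sketch generates pulse onset times for the two stimulated textures. The pulses-per-packet values (nine for RAT, 18 for UAT) are taken from the Methods; the function name, the alignment of each packet to the start of its period and the 1-s duration are assumptions made for illustration only, not the authors' stimulator code.

```python
def pulse_times(carrier_hz, packet_hz, pulses_per_packet, duration_s):
    """Return pulse onset times (s) for a packeted pulse train.

    Assumption: each packet starts at the beginning of its period
    (1 / packet_hz) and contains a fixed number of pulses spaced
    at 1 / carrier_hz.
    """
    times = []
    n_packets = int(round(duration_s * packet_hz))
    for k in range(n_packets):
        packet_start = k / packet_hz
        for i in range(pulses_per_packet):
            times.append(packet_start + i / carrier_hz)
    return times


# RAT: 200-Hz pulses in 10-Hz packets, nine pulses per packet (Methods).
rat = pulse_times(carrier_hz=200, packet_hz=10, pulses_per_packet=9, duration_s=1.0)
# UAT: 400-Hz pulses in 5-Hz packets, 18 pulses per packet (Methods).
uat = pulse_times(carrier_hz=400, packet_hz=5, pulses_per_packet=18, duration_s=1.0)
# NAT: no pulses at all.
nat = []
```

Under these assumptions, both stimulated textures deliver the same number of pulses per second (90), so they differ in temporal pattern rather than in total pulse count, and each packet fits within a 50-ms stimulation window.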

The main challenge solved here was the real-time coupling of ICMS 
feedback to the BMI decoder. Because ICMS artefacts masked neur- 
onal activity for 5-10 ms after each pulse (Fig. 1d, e), we multiplexed 
neuronal recordings and ICMS with a 20-Hz clock rate (Supplemen- 
tary Fig. 2a). The interleaved intervals proved adequate for online 
motor control and artificial sensation—a result that was not clear a 
priori because S1 stimulation could have affected M1 processing 
through the connections between these areas. 

Figure 2 | Learning to use ICMS feedback. a, Behavioural tasks. 2D, two-dimensional; 
VR, virtual reality. b, c, Performance of monkey M (72 sessions). 
Circles (b) depict the fraction of correctly performed trials. Open circles indicate 
chance performance. Curves are lines of best fit. The asterisk indicates sessions 
used for psychometric measurements. Squares, triangles and crosses 
(c) represent mean times spent in RAT, UAT and NAT, respectively. Black, hand 
control; red, brain control. d, Intrasession performance for monkey M. Curves 
represent averages for brain control without hand movements (BCWOH; main 
panel, three sessions), for hand control (inset, 12 sessions) and for brain control 
with hand movements (BCWH; inset, 12 sessions). Lines are best linear fits. 

BMBI performance improved with training. In task I (Fig. 2a), 
monkey M surpassed chance performance after nine sessions and 
monkey N did so after four sessions (P < 0.001, one-sided binomial 
test). Improvement continued with more difficult tasks (tasks II-V) 
(Fig. 2a, b and Supplementary Fig. 3a). In particular, the time spent 
exploring unrewarded artificial textures decreased (Fig. 2c and 
Supplementary Fig. 3b). Additionally, performance improved over the 
course of daily experimental sessions (Fig. 2d). Psychometric analysis 
of RAT stimulation amplitudes (Supplementary Fig. 2b) indicated that 
at least 8 nC per ICMS waveform phase (100-µs-wide current pulses of 
80 µA) was needed for the discrimination of artificial textures 
(P < 0.001, one-sided binomial test). Performance was at chance level 
for catch trials (task II), where ICMS was not delivered (P = 0.90, 
one-sided binomial test). 
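
The 8-nC threshold quoted above follows directly from the pulse width and current amplitude (charge = current × time). A minimal Python sketch of this arithmetic is given below; the function name is hypothetical and the calculation assumes a rectangular current pulse.

```python
def charge_per_phase_nc(current_ua, width_us):
    """Charge (nC) delivered by one rectangular phase of a current pulse.

    Microamperes times microseconds gives picocoulombs, so the product
    is divided by 1,000 to express the result in nanocoulombs.
    """
    return current_ua * width_us / 1000.0


# 80-µA pulses that are 100 µs wide, as stated in the text.
threshold_nc = charge_per_phase_nc(current_ua=80, width_us=100)
```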

The statistics of object exploration intervals (total time spent over a 
particular object in a given trial) indicated that the monkeys uniquely 
discriminated each type of artificial texture (Figs 2c and 3a, c) and 
interpreted ICMS within hundreds of milliseconds—a timescale com- 
parable to that for the discrimination of peripheral tactile stimuli**”’. 
Early in task I, exploration intervals were equal for RAT and NAT 
(P > 0.5, Wilcoxon signed-rank test); with training, they became longer 
for RAT and shorter for NAT (tasks I and II) and UAT (tasks III-V). 
During hand control, the mean interval was longest for RAT (monkey 
M: 1,396 ± 21 ms; monkey N: 1,165 ± 15 ms; mean ± s.e.m.), shortest 
for NAT (304 ± 8 ms; 300 ± 10 ms) and intermediate for UAT 
(452 ± 13 ms; 402 ± 14 ms) (P < 0.01, analysis of variance). During 
brain control, intervals spent exploring NAT (498 ± 15 ms; 
587 ± 25 ms) and UAT (685 ± 20 ms; 764 ± 32 ms) were longer than 
they were during hand control, but were still shorter than those spent 
exploring RAT (1,420 ± 28 ms; 1,398 ± 55 ms) (P < 0.01, analysis of 
variance). 

Figure 3 | Statistics of object exploration. a, Object exploration intervals 
during hand control and brain control (inset) for monkey M (hand control: 
n = 1,809 trials; brain control: n = 1,355 trials). b, State transition diagrams for 
monkey M, indicating the probabilities of reaching among RAT, UAT and 
NAT after the first (left subpanel) or second (right subpanel) reach. Black labels, 
hand control; red labels, brain control; line thickness is proportional to 
transition probability. c, Same as a, but for monkey N (hand control: n = 808 
trials; brain control: n = 729 trials). d, Same as b, but for monkey N. 

Additional hallmarks of active exploration were seen in the conditional 
probabilities of selecting different artificial textures (Fig. 3b, 
d). During hand control trials, the monkeys stayed over the first-encountered 
artificial texture (arrows that loop back to the same 
artificial texture in Fig. 3b, d) with high probability if it was RAT 
(monkey M: P = 0.70; monkey N: P = 0.76), but with low probability 
if it was UAT (P = 0.05; P = 0.01) or NAT (P = 0.0; P = 0.0) (Fig. 3b, 
d, left). After examining the second artificial texture, the monkeys 
could identify the correct artificial texture either by apprehending it 
directly or through a process of elimination. This follows from the 
increase from chance to approximately P = 0.7 in the probability of 
moving to RAT from NAT or UAT and the decrease to P ≈ 0.2 in the 
probability of revisiting UAT or NAT (Fig. 3b, d, right). Similar effects 
were observed for brain control (Fig. 3b, d, red text). 
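
Conditional probabilities of this kind can be estimated by counting transitions between successively visited objects. The Python sketch below shows one way to do so; the function name and the example visit sequences are hypothetical and are not data from this study. Staying over an object is represented here, by assumption, as a transition from a texture to itself.

```python
from collections import Counter, defaultdict


def transition_probabilities(trials):
    """Estimate P(next texture | current texture) from visit sequences.

    `trials` is a list of per-trial lists of texture labels in the
    order they were visited.
    """
    counts = defaultdict(Counter)
    for visits in trials:
        for src, dst in zip(visits, visits[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical visit sequences, for illustration only.
example_trials = [
    ["UAT", "RAT"],
    ["NAT", "RAT"],
    ["UAT", "NAT", "RAT"],
    ["RAT", "RAT"],
]
probs = transition_probabilities(example_trials)
```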

Brain control started in task IV. During BCWH, the monkeys 
continued to hold the joystick although it was disconnected’*”. 
During BCWOH”™”, the joystick was removed. In monkey M, with 
more than 200 recorded neurons, performance was less accurate during 
BCWH (73.75 ± 3.00%; mean ± s.e.m.) than during hand control 
(91.48 ± 1.20%). In monkey N, with 50 recorded neurons, performance 
dropped further (50.37 ± 3.74% versus 91.45 ± 1.91%), but still 
significantly exceeded the 33% chance level. M1 neurons showed 
directionally tuned modulations (Supplementary Figs 5 and 6) that 
were retained across different interfering ICMS patterns during both 
hand control (Supplementary Fig. 4a, b) and brain control (Fig. 4a, b). 
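
A comparison against the 33% chance level can be made with a one-sided binomial test of the kind reported elsewhere in this paper. The Python sketch below computes the exact upper-tail probability; the function name and the trial counts in the example are hypothetical, and the text does not state which test was applied to this particular comparison.

```python
from math import comb


def binomial_upper_tail(successes, trials, chance):
    """Exact P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical example: 50 correct trials out of 100 with three objects,
# so the probability of a correct choice by chance is 1/3.
p_value = binomial_upper_tail(successes=50, trials=100, chance=1 / 3)
```

A small p_value indicates that the observed fraction of correct trials would be unlikely if the monkey were choosing among the three objects at random.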

Figure 4 | M1 modulations during active control versus passive observation. 
a, b, Average brain control movements for monkey M (n = 294 trials) towards 
the RAT, UAT and NAT objects appearing on the left-hand side of the screen 
(a), and corresponding neuronal modulations (b). The colour scale shows 
normalized firing rate (Hz). Only trials with subsequent anticlockwise reaching 
movements are included in the middle and right subpanels. Red vertical lines 
indicate object onset. Firing rate normalized by s.d. c, d, Position of the actuator 
actively controlled (c) and passively observed (d) by monkey M. SNR, 
signal-to-noise ratio. 

In BCWOH, task requirements were eased: the object selection 
period was reduced to 300-500 ms and monkeys were allowed to 
overstay at an incorrect object. The performance of monkey M, measured 
as the number of rewards per minute, steadily improved from 
1.021 ± 0.007 to 2.962 ± 0.005 (mean ± s.e.m.; Fig. 2d). Similar 
improvements were observed for hand control and BCWH (Fig. 2d, 
inset). The average frequency of actuator displacements, calculated 
from power spectra, was correlated with the improvement in performance 
during BCWOH (R² = 0.16 for the horizontal (x) coordinate and 
R² = 0.26 for the vertical (y) coordinate; P < 0.001, F-test), which 
indicated that the monkey modulated its brain activity to scan the 
targets faster. This behaviour was not random, as the exploration 
interval for NAT (3,620 ± 350 ms; mean ± s.e.m.) was significantly 
shorter (P < 0.02, Wilcoxon rank-sum test) than for UAT 
(4,270 ± 310 ms). The exploration of RAT (2,255 ± 94 ms) was the 
shortest owing to the reduced selection period. For monkey N, 
BCWOH performance (2.084 ± 0.085 rewards per minute) did not 
change within sessions, and the differences in exploration intervals 
were not significant. 

In agreement with others*®°°, we observed that M1 neurons repre- 
sented the movements of the actuator even when it was passively 
observed by the monkey (Supplementary Fig. 7). Actuator movements 
(task V) replayed for the monkeys could be reconstructed from M1 
activity, using a separately trained decoder (Fig. 4d), with accuracy 
similar to that in reconstructions made for hand control (Fig. 4c). 
M1 representation of the passively viewed actuator is consistent with 
our suggestion that a neuroprosthetic limb might become incorpo- 
rated in brain circuitry’. 

Our BMBI demonstrated direct bidirectional communication 
between a primate brain and an external actuator. Because both the 
afferent and efferent channels bypassed the subject’s body, we propose 
that BMBIs can effectively liberate a brain from the physical con- 
straints of the body. Accordingly, future BMBIs may not be limited 
to limb prostheses but may include devices designed for reciprocal 
communication among neural structures and with a variety of external 
actuators. 

METHODS SUMMARY 


All animal procedures were performed in accordance with the National Research 
Council’s Guide for the Care and Use of Laboratory Animals and were approved 
by the Duke University Institutional Animal Care and Use Committee. Two 
rhesus monkeys were implanted with microwire arrays in both brain hemispheres. 
These implants were used for both recordings and ICMS (symmetric, biphasic, 
charge-balanced pulse trains; 100-200 µs, 120-200 µA). Monkeys manipulated a 
joystick to cause an actuator (computer cursor or a virtual-reality arm) to reach 
towards up to three objects displayed on a computer monitor. The task required 
searching for the single object with particular artificial tactile properties. Objects 
consisted of a central response zone and a peripheral feedback zone. Artificial 
tactile feedback was delivered when the actuator entered the feedback zone and 
continued in the response zone. Holding the actuator over the correct object for 
0.8-1.3 s produced a reward (fruit juice). Holding the actuator over an incorrect 
object cancelled the trial. In brain control trials, the actuator was controlled by 
cortical ensemble activity decoded using an unscented Kalman filter’’. An inter- 
leaved scheme of alternating recording and stimulation subintervals (50 ms each, 
50% duty cycle) was implemented to achieve concurrent afferent and efferent 
operations. In all offline analyses, ICMS periods were excluded from calculations 
of neuronal firing rates. The virtual-reality arm was animated using 
MOTIONBUILDER (Autodesk). 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Subjects and implants. Two adult rhesus macaque monkeys (Macaca mulatta) 
participated in this study. Each monkey was implanted with four 96-microwire 
arrays constructed of stainless steel 304. Each hemisphere received two arrays: one 
in the upper-limb representation area and one in the lower-limb representation 
area. These arrays sampled neurons in both M1 and S1. We used recordings from 
the right-hemisphere arm arrays in each monkey, because each manipulated the 
joystick with its left hand. Within each array, microwires were grouped in two four- 
by-four, uniformly spaced grids each consisting of 16 electrode triplets. The sepa- 
ration between electrode triplets was 1 mm. The electrodes in each triplet had three 
different lengths, increasing in 300-µm steps. The penetration depth of each triplet 
was adjusted with a miniature screw. After adjustments during the month following 
the implantation surgery, the depth of the triplets was fixed. The longest electrode in 
each triplet penetrated to a depth of 2 mm as measured from the cortical surface. 

Tasks. The monkeys were trained to manipulate a computer cursor or a virtual- 
reality arm and to reach, using this actuator, towards objects displayed on a 
computer monitor. The objects were visually identical, but had different tactile 
properties as conveyed by ICMS of S1. In hand control, each trial commenced 
when the monkey held the joystick with its working hand. Then a target 
appeared in the centre of the screen. The monkey had to hold the actuator within 
that centre target for a random hold time uniformly drawn from the interval 0-2 s. 
After this, the central target disappeared and was replaced by a set of virtual objects 
radially arranged about the centre of the screen. Each of these consisted of a central 
response zone and a peripheral feedback zone, distinguished by their shading 
(Supplementary Fig. 1c). Tactile feedback was delivered in the feedback zone or 
the corresponding response zone. For monkey M, the radius of the response zone 
varied from 1.5 to 4.0 cm and the radius of the feedback zone varied from 4.5 to 
7.25 cm, across all tasks and sessions. For monkey N, the radius of the response 
zone varied from 1.5 to 4.5 cm and the radius of the feedback zone varied from 4.75 
to 9.5 cm, across all tasks and sessions. A trial was concluded when the monkey 
placed the actuator within the response zone for a hold interval (800-1,300 ms for 
hand control, depending on the session; 300-500 ms for brain control) or the 
monkey released the joystick handle (in hand control trials). The next trial com- 
menced after an intertrial interval of 500 ms. 

The sequence of events was the same during brain control trials. In some brain 
control sessions, the joystick was removed from the behavioural set-up. For these, 
each new trial commenced following the previous intertrial interval without the 
requirement for the monkey to hold the joystick. In tasks I-III, monkeys chose 
from a set of two objects. In task I, the monkeys had to choose between RAT and 
NAT for fixed object locations. In task II, RAT and NAT were presented on the 
screen at different angular locations in each trial. In task III, object number and 
spatial arrangement were the same as in task II, but RAT and UAT were used. In 
task IV, three objects were used (RAT, UAT and NAT) and their arrangement on 
the screen varied from trial to trial. Finally, in task V, the virtual-reality monkey 
arm replaced the computer cursor. 

Psychometric measurements. Psychometric measurements determined the 
minimum ICMS amplitude that the monkeys could discriminate (Supplemen- 
tary Fig. 2b). In these measurements, the ICMS amplitude was different in every 
trial. In each psychometric session, a range of amplitudes was selected such that 
about half were in a range clearly above the monkeys’ threshold for discrimination 
and half were in a range of unknown discriminability. 

Catch trials. In some sessions, a small percentage of trials (typically 1%) were 
designated as catch trials. In these trials, the microstimulator delivered pulse trains 
with zero amplitude, but all other aspects of the behavioural task remained the 
same. This allowed us to confirm that there were no unintentional sources of 
information that the monkeys could use to perform the tasks. 

Algorithms. An Nth-order unscented Kalman filter?* (UKF) was used for brain 
control predictions. Up to a tenth-order UKF was used in some sessions, but in 
most sessions we found that the third-order UKF was sufficient. The filter para- 
meters were fitted on the basis of either the hand movements of the monkeys while 
they performed the task using a joystick or on passive observation of actuator 
movements while the monkeys’ arms were restrained. 

ICMS. Symmetric, biphasic, charge-balanced pulse trains were delivered in a 
bipolar fashion across pairs of microwires. The channels selected had clear sensory 
receptive fields in the upper limb (monkey M: two pairs of microwires with 
synchronous pulse trains) or lower limb (monkey N: one pair of microwires). 
For monkey M, the anodic and cathodic phases of stimulation had a pulse width 
of 105 µs; for monkey N, the pulse width was 200 µs. The anodic and cathodic 
phases of the stimulation waveforms were separated by 25 µs. 

Interleaved ICMS and recordings. We implemented an interleaved scheme of 
alternating recording and stimulation intervals (Supplementary Fig. 2a). Our BMI 
had a 10-Hz update rate. That is, 100 ms of past neural data were used to make 
predictions about the desired state of the actuator. We broke up each 100-ms 
interval into two 50-ms subintervals. In the first subinterval (Rec), neural activity 
was recorded as usual and the measured spike count was used to estimate the firing 
rate for the whole 100-ms interval. The second subinterval (Stim) was reserved 
exclusively for delivering ICMS; all spiking activity occurring in this subinterval 
was discarded. Whenever the actuator was in contact with a virtual object at the 
start of a Stim interval, an ICMS pulse train was delivered. For RAT, nine pulses of 
ICMS were delivered; for UAT, 18 pulses of ICMS were delivered; and for NAT, no 
pulses of ICMS were delivered. The neural activity in the Stim interval was dis- 
carded even for NAT, so that there would be no bias induced by ICMS-occluded 
neural data. 
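
The interleaving scheme described above can be sketched as follows. This minimal Python example keeps only spikes that fall in the recording (Rec) half of each 100-ms update interval and uses that count to estimate the firing rate for the whole interval. The function name and the example spike times are hypothetical; this is an illustration of the scheme as described, not the authors' acquisition code.

```python
def interleaved_rates_hz(spike_times_ms, n_bins, bin_ms=100, rec_ms=50):
    """Estimate firing rate per update interval from the Rec subinterval.

    Each bin of `bin_ms` is split into a recording window (the first
    `rec_ms`) and a stimulation window (the remainder). Spikes in the
    stimulation window are discarded, whether or not ICMS was delivered.
    """
    counts = [0] * n_bins
    for t in spike_times_ms:
        k = int(t // bin_ms)
        if 0 <= k < n_bins and (t - k * bin_ms) < rec_ms:
            counts[k] += 1
    # Convert the count in the Rec window to spikes per second.
    return [c * 1000.0 / rec_ms for c in counts]


# Hypothetical spike times (ms) for one neuron over two update intervals.
spikes_ms = [10, 20, 70, 120, 180]
rates = interleaved_rates_hz(spikes_ms, n_bins=2)
```

In this example the spikes at 70 ms and 180 ms fall in stimulation windows and are ignored, so the two intervals yield estimates of 40 and 20 spikes per second.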

Virtual-reality monkey arm. In task V, we introduced a novel, brain-controlled 
virtual-reality arm with realistic kinematic movements and spatial interactions. 
The control loop rate was 50 Hz, with visual refreshing at 30 Hz. The arm model 
was designed to depict a rhesus macaque. We presented a first-person perspective 
of the virtual-reality arm to the monkey, who controlled the position of the hand. 
Arm posture was controlled using a mixture of direct control of end effectors and 
inverse kinematics, constrained by the physical interdependencies of the joints. 


©2011 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


doi:10.1038/nature10491 


An endogenous tumour-promoting ligand of the human aryl hydrocarbon receptor 


Christiane A. Opitz, Ulrike M. Litzenburger, Felix Sahm, Martina Ott, Isabel Tritschler, Saskia Trump, 
Theresa Schumacher, Leonie Jestaedt, Dieter Schrenk, Michael Weller, Manfred Jugold, Gilles J. Guillemin, 
Christine L. Miller, Christian Lutz, Bernhard Radlwimmer, Irina Lehmann, Andreas von Deimling, Wolfgang Wick 
& Michael Platten 


Activation of the aryl hydrocarbon receptor (AHR) by environmental xenobiotic toxic chemicals, for instance 
2,3,7,8-tetrachlorodibenzo-p-dioxin (dioxin), has been implicated in a variety of cellular processes such as 
embryogenesis, transformation, tumorigenesis and inflammation. But the identity of an endogenous ligand activating 
the AHR under physiological conditions in the absence of environmental toxic chemicals is still unknown. Here we 
identify the tryptophan (Trp) catabolite kynurenine (Kyn) as an endogenous ligand of the human AHR that is 
constitutively generated by human tumour cells via tryptophan-2,3-dioxygenase (TDO), a liver- and neuron-derived 
Trp-degrading enzyme not yet implicated in cancer biology. TDO-derived Kyn suppresses antitumour immune 
responses and promotes tumour-cell survival and motility through the AHR in an autocrine/paracrine fashion. The 
TDO-AHR pathway is active in human brain tumours and is associated with malignant progression and poor survival. 
Because Kyn is produced during cancer progression and inflammation in the local microenvironment in amounts 
sufficient for activating the human AHR, these results provide evidence for a previously unidentified 
pathophysiological function of the AHR with profound implications for cancer and immune biology. 


Degradation of Trp by indoleamine-2,3-dioxygenases 1 and 2 (IDO1/2) 
in tumours and tumour-draining lymph nodes inhibits antitumour 
immune responses’ ~ and is associated with a poor prognosis in various 
malignancies®. Inhibition of IDO1/2 suppresses tumour formation in 
animal models’ and is currently tested in phase I/II clinical trials in 
patients with cancer’. The relevance of Trp catabolism to human 
tumour formation and progression is, however, as yet unknown. 


TDO degrades Trp to Kyn in human brain tumours 


A screen of human cancer cell lines revealed constitutive degradation of 
Trp and release of high micromolar amounts of Kyn in brain tumour 
cells, namely glioma cell lines and glioma-initiating cells (GICs), but not 
human astrocytes (Fig. 1a). IDO1 and IDO2 did not account for the 
constitutive Trp catabolism in brain tumours (Supplementary Fig. 1a-e). 
Instead, tryptophan-2,3-dioxygenase (TDO), which is predominantly 
expressed in the liver (Supplementary Fig. 3a, b) and is believed to regu- 
late systemic Trp concentrations®, was strongly expressed in human 
glioma cells (Supplementary Fig. 1b) and correlated with Kyn release 
(Fig. 1b). Pharmacological inhibition or knockdown of TDO blocked 
Kyn release by glioma cells, whereas knockdown of IDO1 and IDO2 
had no effect (Fig. 1c,d and Supplementary Fig. 2a), thus confirming 
that TDO is the central Trp-degrading enzyme in human glioma cells. In 
human brain tumour specimens, TDO protein levels increased with 
malignancy and correlated with the proliferation index (Fig. 1e-g and 
Supplementary Fig. 2b-d). As described previously*, healthy human 
brain showed only weak TDO staining in neurons (Fig. 1e). TDO 
expression was not confined to gliomas but was also detected in other 
cancer types (Supplementary Fig. 3b, c). Lower Trp concentrations were 
measured in the serum of patients with glioma (Fig. 1h). These may not 
have translated into increased Kyn levels (Fig. 1h) because Kyn is taken 
up by other cells and metabolized to quinolinic acid. Indeed, accumula- 
tion of quinolinic acid was detected in TDO-expressing glioblastoma 
tissue (Fig. 1i and Supplementary Fig. 3d). 


Effects of TDO-mediated Kyn release on immune cells 


Kyn suppresses allogeneic T-cell proliferation’. Allogeneic T-cell pro- 
liferation was inversely correlated with Kyn formation by glioma- 
derived TDO (Fig. 2a and Supplementary Fig. 4a, b). Knockdown of 
TDO in glioma cells (Supplementary Fig. 4c, d) restored allogeneic 
T-cell proliferation, and the addition of Kyn to the TDO knockdown 
cells prevented the restoration of T-cell proliferation (Fig. 2b). The 
proliferation of CD4+ and CD8+ T cells stimulated by the T-cell 
receptor was inhibited by Kyn in a concentration-dependent manner 
(Supplementary Fig. 4e). In addition, knockdown of TDO resulted in 
enhanced lysis of glioma cells by alloreactive peripheral blood mono- 
nuclear cells (Supplementary Fig. 4f). Finally, decreased infiltration 
with leukocyte common antigen (LCA)-positive and CD8+ immune 
cells was observed in sections of human glioma with high TDO 
expression in comparison with those with low TDO expression 
(Fig. 2c), indicating that Kyn formation by TDO may suppress 
antitumour immune responses. In vivo experiments in immunocompetent 
mice demonstrated that tumours expressing TDO grew 
faster and had a higher proliferation index than TDO-deficient control 
tumours (Fig. 2d and Supplementary Fig. 4g-i). TDO activity in 
tumours suppressed antitumour immune responses in vivo, as 
demonstrated by a decreased release of interferon-γ by tumour-specific 
T cells and tumour cell lysis by spleen cells of mice bearing 
TDO-expressing tumours in comparison with mice bearing TDO-deficient 
tumours (Fig. 2e, f). 


Effects of TDO-mediated Kyn release on glioma cells 


We next assessed the autocrine effects of Kyn on glioma cells. 
Although no differences in cell cycle progression were detected 
between controls and glioma cells with TDO knockdown (Supplementary 
Fig. 5a), knockdown of TDO decreased motility and 
clonogenic survival (Fig. 2g, h and Supplementary Fig. 5b-d). This 
was mediated by Kyn because exogenous addition of Kyn restored 
motility and clonogenic survival in the absence of Trp (Fig. 2i, j and 
Supplementary Fig. 5e, f), suggesting that Kyn increases the motility of 
malignant glioma cells. In GICs, sphere formation was enhanced in 
response to Kyn (Supplementary Fig. 5g). Finally, tumour formation 
was impaired when TDO knockdown tumours were orthotopically 
implanted in the brains of nude mice, which are devoid of functional 
T cells (Fig. 2k and Supplementary Fig. 5h, i). To analyse whether 
TDO-mediated inhibition of antitumour natural killer (NK)-cell responses, 
which are functional in nude mice, might account for the 
impaired formation of TDO knockdown tumours, we compared subcutaneous 
tumour growth in the presence or absence of NK cells. NK-cell 
depletion (Supplementary Fig. 5j) enhanced the growth of both 
control and TDO knockout tumours but did not restore the growth of 
TDO knockout tumours to that in controls (Fig. 2l and Supplementary 
Fig. 5k), suggesting that Kyn generated by constitutive TDO activity 
enhances the malignant phenotype of human gliomas in an autocrine 
manner in the absence of functional antitumour T-cell and NK-cell 
responses. 

Figure 1 | TDO degrades Trp to Kyn in human brain tumours. a, Trp (left) 
and Kyn (right) content in the supernatants of human astrocytes (hAs), glioma 
cell lines and GICs (T323) cultured for 72 h and measured by high-performance 
liquid chromatography (HPLC) (n = 4). b, Correlation between TDO mRNA and 
Kyn release by human glioma cells measured by quantitative RT-PCR and HPLC 
(n = 4). c, Kyn concentrations in the supernatants of U87 glioma cells cultured for 
48 h in the presence of the TDO inhibitor 680C91 (black bars) or its solvent (white 
bars; n = 4, P = 0.005, 0.002 and 0.0009 for 1, 5 and 10 µM 680C91, respectively). 
d, Kyn release by glioma cells after knockdown of TDO (black bars; P = 0.000007, 
0.0007 and 0.00006, respectively), IDO1 (dark grey bars) or IDO2 (light grey bars) 
by siRNA (n = 3). e, Upper panel: weak neuronal TDO expression in healthy 
brain tissue. Lower panel: TDO expression in glioblastoma (WHO grade IV). 
TDO staining is in red. Asterisk indicates necrosis; arrowheads indicate the border 
to infiltrated brain tissue. Insets: single tumour cells (arrows) infiltrating the 
adjacent brain tissue. Magnifications: ×40 (main panels), ×400 (upper inset) and 
×100 (lower inset). f, Plot of TDO expression (H-score; see Supplementary 
Methods) in brain tumours of increasing malignancy (WHO grades II-IV; grade 
II, n = 18; grade III, n = 15; grade IV, n = 35). g, Correlation of the Ki-67 
proliferative index with the TDO H-score in gliomas of different WHO grades 
(n = 42). h, Trp (left) and Kyn (right) concentrations in the sera of 24 patients 
with glioblastoma and 24 age-matched and sex-matched healthy controls, 
measured by HPLC. i, Quantification of staining with quinolinic acid in healthy 
human brain tissue (white bar; n = 5) and glioblastoma tissue (black bar; n = 5). 
The data distribution in f and g is presented as box plots, showing the 25th and 
75th centiles together with the median; whiskers represent the 10th and 90th 
centiles, respectively. Error bars indicate s.e.m. 


Kyn activates the aryl hydrocarbon receptor 


For a better understanding of the molecular mechanisms underlying 
the autocrine effects of Kyn on glioma cells, we performed microarray 
analyses of Kyn-treated glioma cells revealing broad induction of aryl 
hydrocarbon receptor (AHR) response genes by Kyn (Fig. 3a and 
Supplementary Figs 6a,b and 7). Pathway analyses showed that the 
25 genes that were most strongly induced by Kyn treatment in U87 
cells at 8h and at 24h were all directly or indirectly regulated by the 
AHR (Fig. 3a and Supplementary Fig. 6b). The AHR is a transcription 
factor of the basic helix-loop-helix (bHLH) Per-Arnt-Sim (PAS) 
family, which is activated by xenobiotics such as benzo[a]pyrene 
and 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD)”*. Malignant glioma 
cell lines and GICs express the AHR constitutively (Supplementary 
Fig. 6c)'', and upregulation of AHR target genes by Kyn was confirmed 
in two different glioma cell lines (Supplementary Fig. 6d, e). Kyn led to 
translocation of the AHR into the nucleus after 1h, thus showing an 
immediate effect of Kyn on the AHR (Fig. 3b,c and Supplementary 
Fig. 8a). In accordance with this, western blot analyses of Kyn- 
activated tumour cells showed decreased cytoplasmic localization 
paralleled by increased nuclear accumulation of the AHR comparable 
to that induced by TCDD (Fig. 3d). In the nucleus the AHR forms 
a heterodimer with the AHR nuclear translocator (ARNT) that 
interacts with the core-binding motif of the dioxin-responsive ele- 
ments (DRE) located in regulatory regions of AHR target genes’’. 
Kyn induced DRE-luciferase activity in glioma cells, with a concentration 
giving half-maximal response (EC50) of 36.6 µM (Fig. 3e, 

Figure 2 | Paracrine and autocrine effects of TDO-mediated Kyn release by 
glioma cells. a, Correlation of the proliferation of peripheral blood 
mononuclear cells (PBMCs) cultured with allogeneic glioma cell lines with the 
Kyn release of the glioma cells (n = 3). b, Proliferation of PBMCs cultured with 
allogeneic TDO-expressing control U87 glioma cells (sh-c) in comparison with 
U87 glioma cells with a stable short hairpin RNA-mediated knockdown of TDO 
(sh-TDO), with or without 100 µM Kyn (black bars), in comparison with 
PBMCs alone with or without 100 µM Kyn (white bars; n = 3). 
c, Quantification of LCA+ cells (left) and CD8+ cells (right) stained in human 
glioma sections with low TDO expression (H-score < 150, white bar; n = 12 for 
LCA, n = 10 for CD8) and in human glioma sections with high TDO 
expression (H-score ≥ 150, black bar; n = 17 for LCA, n = 10 for CD8). 
d, Growth of Tdo-deficient GL261 murine glioma cells stably transfected with 
Tdo (filled circles) or empty vector (open circles) injected subcutaneously into 
the flank of C57BL/6N mice was monitored using metric callipers (n = 6). 
Tumour weight (g) was calculated as 0.5 × length (cm) × width² (cm²). 
e, Interferon-γ release by T cells of mice bearing subcutaneous Tdo-expressing 
tumours (black bar) in comparison with T cells of mice bearing Tdo-deficient 


Supplementary Fig. 8b). AHR activation was unique to Kyn in a panel 
of Trp catabolites (Supplementary Table 1). An ethoxyresorufin-O- 
deethylase (EROD) assay confirmed the induction of the functional 
AHR target gene CYP1A1, encoding cytochrome P450, family 1, subfamily A, 
polypeptide 1, with an EC50 of 12.3 µM for Kyn (Supplementary Fig. 8c). 
Radioligand binding assays with mouse liver 
cytosol from Ahr-proficient and Ahr-deficient mice showed that Kyn 
binds to the AHR with an apparent Kd of roughly 4 µM (Fig. 3f). 

Activation of the AHR and upregulation of AHR-regulated gene 
expression in response to Kyn were inhibited by the AHR antagonist 
dimethoxyflavone or knockdown of AHR (Fig. 3g and Supplementary 
Fig. 8g-k), indicating that Kyn is a specific agonist of the AHR. The 
involvement of the same or similar AHR residues in the binding to 
Kyn, TCDD and 3-methylcholanthrene was confirmed by the fact 
that dimethoxyflavone inhibited the activation of the AHR by all three 
ligands (Supplementary Fig. 8g-i). 


The effects of TDO-derived Kyn are mediated by the AHR 


The endogenous production of Kyn in glioma cells was sufficient to 
activate the AHR, because knockdown of TDO decreased the expression 
of AHR-regulated genes (Fig. 3h and Supplementary Fig. 8l-o). 


tumours (white bar) after re-stimulation with glioma lysates measured by 
ELISpot (n = 3). f, Lysis of GL261 murine glioma cells by spleen cells of mice 
with Tdo-expressing GL261 tumours in comparison with those with 
subcutaneous Tdo-deficient GL261 tumours, measured by chromium release 
(n = 4). g, Quantification of the migrated distances of sh-c (open squares) and 
sh-TDO (filled circles) cells into a collagen matrix (n = 3, P = 0.004, 0.0005 and 
0.01 for 24, 48 and 72 h, respectively). h, Clonogenic survival of sh-c (white bar) 
and sh-TDO (black bar) U87 cells (n = 3). i, Matrigel Boyden chamber assay 
of U87 glioma cells in the absence or presence of 70 µM Trp without or 
with 30 or 60 µM Kyn (n = 3). j, Clonogenic survival of LN-18 glioma cells in 
the absence or presence of 70 µM Trp without or with 30 or 60 µM Kyn (n = 3). 
k, Representative cranial MRIs, haematoxylin/eosin staining (H&E) and nestin 
staining of CD1 nu/nu mice implanted with sh-c (upper panels) or sh-TDO 
(lower panels) U87 glioma cells. The images are representative of two 
independent experiments (n = 6). l, Tumour weight of sh-c (white bars) and 
sh-TDO (black bars) U87 glioma cells injected subcutaneously in the flank of 
CD1 nu/nu mice that were treated either with anti-asialo GM1 antibody (asialo) 
for NK-cell depletion or control IgG (IgG) (n = 8). Error bars indicate s.e.m. 
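
The calliper-based estimate given in panel d of this legend (tumour weight = 0.5 × length × width²) can be sketched as a short function. This is an illustration only: the function name and the example readings are hypothetical and are not taken from the study.

```python
def tumour_weight_g(length_cm: float, width_cm: float) -> float:
    """Estimate tumour weight (g) as 0.5 x length (cm) x width^2 (cm^2).

    The formula is the one stated in the Figure 2 legend; the function
    name and the input validation are illustrative assumptions.
    """
    if length_cm < 0 or width_cm < 0:
        raise ValueError("calliper measurements must be non-negative")
    return 0.5 * length_cm * width_cm ** 2


if __name__ == "__main__":
    # Hypothetical calliper readings, not data from the paper.
    print(tumour_weight_g(2.0, 1.0))
```

Because width enters as a square, a measurement error in width affects the estimate more strongly than the same relative error in length.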


Because mean Kyn concentrations of 37.01 ± 13.4 µM were measured 
in U87 xenografts (n = 6), sufficient Kyn concentrations to activate 
the AHR were also reached in vivo. 
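
As a rough plausibility check of this statement, the reported values can be put into standard one-site binding and dose-response expressions. The sketch below assumes simple one-site binding and a Hill coefficient of 1, neither of which is stated in the text, and it combines values measured in different systems (mouse liver cytosol for the apparent Kd, U87 reporter cells for the EC50). It should be read as an order-of-magnitude illustration rather than as the authors' analysis.

```python
def fractional_occupancy(ligand_um: float, kd_um: float) -> float:
    """One-site binding: fraction of receptor bound at a free ligand concentration."""
    return ligand_um / (ligand_um + kd_um)


def fractional_response(conc_um: float, ec50_um: float, hill: float = 1.0) -> float:
    """Hill equation: fraction of the maximal response at a given concentration.

    A Hill coefficient of 1 is an assumption made for this illustration.
    """
    return conc_um ** hill / (conc_um ** hill + ec50_um ** hill)


if __name__ == "__main__":
    kyn_xenograft_um = 37.01   # mean Kyn concentration reported in U87 xenografts
    kd_um = 4.0                # apparent Kd reported from radioligand binding
    ec50_reporter_um = 36.6    # EC50 reported for DRE-luciferase activity

    print(round(fractional_occupancy(kyn_xenograft_um, kd_um), 2))
    print(round(fractional_response(kyn_xenograft_um, ec50_reporter_um), 2))
```

Under these assumptions the mean xenograft concentration lies well above the apparent Kd and close to the reporter EC50, which is in line with the conclusion drawn in the text.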

In accordance with activation of the AHR by TDO-derived Kyn, 
expression of the AHR target gene TIPARP in LCA+ immune cells 
was observed only in human glioma sections expressing TDO 
(Fig. 4a). To determine whether TDO influences antitumour immune 
responses through the AHR we analysed the infiltration of immune 
cells in human glioma sections in relation to their AHR expression. 
Indeed, infiltration by LCA+ and CD8+ immune cells was decreased 
in sections of human gliomas with high AHR expression compared 
with those with low AHR expression (Fig. 4b). To analyse the contri- 
bution of host AHR expression to tumour growth, we compared the 
growth of murine tumours with and without Tdo expression in Ahr- 
deficient and Ahr-proficient mice. The growth of Tdo-expressing 
tumours was attenuated in Ahr-deficient mice when compared with 
Ahr-proficient mice (Fig. 4c) indicating that AHR-mediated host 
effects enhance tumour growth. Staining of LCA+ immune cells in 
the tumours revealed that expression of TDO decreased the infiltration 
with LCA+ immune cells in Ahr-proficient mice but not in Ahr-deficient 
mice (Fig. 4d and Supplementary Fig. 9a), suggesting that 
Figure 3 | Kyn activates the AHR. a, Connection of the 25 genes that were 
most strongly induced by Kyn treatment in U87 cells after 8 h to AHR signalling 
(red, upregulation; green, downregulation). b, Translocation of green 
fluorescent protein (GFP)-tagged AHR into the nucleus of mouse hepatoma 
cells, which do not degrade Trp, after 3 h treatment with 50 µM Kyn, 50 µM Trp 
or 1 nM TCDD (negative control: medium). c, Ratios of nuclear to cytoplasmic 
fluorescent intensity in cells with GFP-tagged AHR after 3 h of indicated 
treatment (negative control: medium; positive control: 1 nM TCDD, 50 µM 
Kyn). The data distribution is represented by box plots, showing the 25th and 
75th centiles together with the median; whiskers represent the 10th and 90th 
centiles, respectively (P < 0.001, one-way analysis of variance on ranks, followed 
by Dunn’s method). d, AHR western blots of two different nuclear (N) and 
cytoplasmic (C) fractions each of control (lanes 1 and 2), Kyn-treated (lanes 3 


TDO-mediated suppression of antitumour immune responses 
through the AHR contributes to the host effects enhancing the growth 
of Tdo-expressing tumours. In addition, while in Ahr-proficient 
mice the expression of Tdo strongly enhanced tumour growth in 
comparison with tumours not expressing Tdo, the same effect was 
and 4) and TCDD-treated (lanes 5 and 6) human LN-229 glioma cells. e, Dioxin-responsive 
element (DRE) chemical activated luciferase gene expression in U87 
glioma cells treated with the indicated Kyn concentrations (n = 2). 

f, Radioligand binding assay with indicated concentrations of L-[³H]Kyn using 
mouse liver cytosol from Ahr-proficient and Ahr-deficient mice. Specific 
binding was calculated by subtracting the radioactivity measured in Ahr-deficient 
cytosol from that of Ahr-proficient cytosol (n = 4). g, CYP1A1 mRNA 
expression in sh-AHR LN-308 glioma cells (black bars) in comparison with 
controls (sh-c; white bars) treated with 100 µM Kyn, 1 nM TCDD or controls 
(n = 4). h, mRNA expression of AHR target genes in sh-TDO (black bars) in 
comparison with sh-c U87 glioma cells (white bars; n = 4) (AHRR, P = 0.02; 
CYP1A1, P = 0.0007; IL1A, P = 0.001; IL1B, P = 0.0000006; IL6, P = 0.0047; 
IL8, P = 0.01; PAI-2, P = 0.0005; TIPARP, P = 0.06). Error bars indicate s.e.m. 


observed in Ahr-deficient mice, although to a much smaller extent 
(Fig. 4c). Because murine glioma cells express functional AHR (Sup- 
plementary Fig. 9b-e), these results suggest that the increase in 
tumour growth mediated by TDO in Ahr-deficient mice is due to 
autocrine effects of TDO on the tumour cells themselves. 

This notion is supported by the fact that Kyn failed to induce 
motility of human glioma cells after AHR knockdown (Fig. 4e). In 
addition, the increase in clonogenic survival in response to Kyn was 
abolished in glioma cells with a knockdown of AHR (Fig. 4f). Finally, 


Figure 4 | The autocrine and paracrine effects of TDO-derived Kyn are 
mediated through the AHR. a, Immunofluorescence staining of LCA and 
TIPARP in human glioma sections with low or high TDO expression. 
Magnification ×400. b, Quantification of LCA+ cells (left) and CD8+ cells 
(right) stained in human glioma sections with low AHR expression 
(H-score < 150, white bar; n = 10 for LCA, n = 8 for CD8) and in human 
glioma sections with high AHR expression (H-score ≥ 150, black bar; n = 12 
for LCA, n = 12 for CD8). c, Tumour weight measured 15 days after 
subcutaneous injection of murine GL261 glioma cells with and without Tdo 
expression in the flanks of Ahr-proficient mice (white bars) or Ahr-deficient 
mice (black bars; n = 6). d, Quantification of LCA+ immune cells stained in the 
subcutaneous Tdo-proficient and Tdo-deficient GL261 tumours in Ahr-proficient 
and Ahr-deficient mice presented as box plots, showing the 25th and 
75th centiles and the median (n = 4). WT, wild type. e, Migration of sh-c LN-308 
glioma cells (white bars) and LN-308 glioma cells with knockdown of the 
AHR by two different shRNAs (grey bars, sh-AHR1; black bars, sh-AHR2) in 
the presence or absence of 100 µM Kyn (n = 4). f, Clonogenicity of sh-c (white 
bars) and sh-AHR (black bars) LN-308 glioma cells with or without 100 µM 
Kyn (n = 3). g, Growth of AHR-proficient (filled circles) and AHR-deficient 
(open circles) human LN-308 glioma cells injected subcutaneously into the 
flank of CD1 nu/nu mice was monitored with metric callipers (n = 7). Tumour 
weight was calculated as in Fig. 2d. P values: day 5, 0.147; day 9, 0.546; day 12, 
0.027; day 16, 0.008; day 19, 7.18 X 10~*; day 22, 1.77 X 10° *; day 26, 

1.57 X 107; day 30, 8.49 X 10-4; day 34, 8.26 X 104; day 41, 0.022; day 45, 
0.0477. Error bars indicate s.e.m. 
in vivo experiments demonstrated that induced knockdown of 
AHR in human glioma cells inhibited tumour growth in immuno- 
compromised mice (Fig. 4g and Supplementary Fig. 9f), underscoring 
the importance of AHR signalling for the autocrine effects of Trp 
degradation. 


TDO-derived Kyn activates the AHR in human cancer 

Next we aimed at investigating whether TDO-derived Kyn activates the 
AHR in human brain tumour tissue. Indeed, TDO expression correlated 
with the expression of the AHR and AHR target genes in human glioma 
tissue (Fig. 5a-c and Supplementary Fig. 9g), indicating that constitutive 
TDO expression in glioma cells produced sufficient Kyn levels to activate 
the AHR. To address whether the TDO-Kyn-AHR signalling pathway 
is also activated in cancers other than glioma, we analysed microarray 
data of diverse human tumour entities. Interestingly, TDO expression 
correlated with the expression of the AHR target gene CYP1B1 not only 
in glioma (Fig. 5c) but also in B-cell lymphoma, Ewing sarcoma, bladder 
carcinoma, cervix carcinoma, colorectal carcinoma, lung carcinoma 
and ovarian carcinoma (Fig. 5d, Supplementary Fig. 10a and 
Supplementary Table 2). This finding indicates that the TDO-Kyn- 
AHR pathway is not confined to brain tumours but seems to be a 
common trait of cancers. Analysis of the Rembrandt database revealed 
that the overall survival of patients with glioma (WHO grades II-IV) 
with high expression of TDO, the AHR or the AHR target gene CYP1B1 
was reduced in comparison with patients with intermediate or low 
expression of these genes (Fig. 5e and Supplementary Fig. 10b)’’. 
Finally, in patients with glioblastoma (WHO grade IV)", the expres- 
sion of the AHR targets CYP1B1, IL1B, IL6 and IL8, which are regu- 
lated by TDO-derived Kyn in glioma cells (Fig. 3h and Supplementary 
Fig. 5d, e), was found to predict survival independently of WHO grade 
(Fig. 5f and Supplementary Fig. 10c), thus further underscoring the 
importance of AHR activation for the malignant phenotype of gliomas. 
In summary, these data suggest that endogenous tumour-derived Kyn 
activates the AHR in an autocrine/paracrine fashion to promote 
tumour progression (Fig. 5g). 


Discussion 

Cancer-associated immunosuppression by Trp degradation has 
until now been attributed solely to the enzymatic activity of IDO in 
cancer cells and tumour-draining lymph nodes. IDO inhibition is 
therefore currently being evaluated as a therapeutic strategy in the 
treatment of cancer in clinical trials’ despite some off-target effects on 
human cancer cells. We show that TDO is strongly expressed in 
cancer and is equally capable of producing immunosuppressive 
Kyn. In IDO-negative glioma cells, TDO seems to be the sole deter- 
minant of constitutive Trp degradation, indicating that TDO repre- 
sents a novel therapeutic target in glioma therapy. In fact, an orally 
available TDO inhibitor has recently been developed’®. Inhibition of 
TDO may not only restore antitumour immune responses but also 
act on the tumour cell intrinsic malignant phenotype, because we 
delineated the importance of constitutive Trp degradation to sustain 
the malignant phenotype of cancer by acting on the tumour cells 
themselves. 

Emerging evidence indicates a tumour-promoting role of the AHR. 
AHR activation promotes clonogenicity and invasiveness of cancer 
cells''!”. Transgenic mice with a constitutively active AHR sponta- 
neously develop tumours'*””, and the repressor of the AHR (AHRR) 
represents a tumour suppressor in multiple human cancers”. The 
aberrant phenotype of Ahr-deficient mice points to the existence of 
endogenous AHR ligands’. Although different endogenously pro- 
duced metabolites such as arachidonic acid metabolites, bilirubin, 
cyclic AMP, tryptamine and 6-formylindolo[3,2-b]carbazole (FICZ) 
have been shown to be agonists of the AHR”, their functionality has 
not been convincingly demonstrated in a pathophysiological context 
such as cancer or immune activation. The search for endogenous 
ligands of the AHR is therefore continuing. 
Figure 5 | TDO-derived Kyn activates the AHR in diverse human cancers, 
and AHR activation predicts survival in patients with glioma. a, Correlation 
of TDO expression (red) and AHR expression (brown) in consecutive sections 
of human glioblastoma tissue. Arrows indicate vessels for orientation. 
Magnification ×40; insets ×200. b, Correlation between TDO and AHR 
expression in human glioma tissue based on H-scores of TDO and AHR, 
calculated by Spearman rank correlation (n = 26). c, Correlation between TDO 
and CYP1B1 expression in microarray data of human glioblastoma (n = 396) 
analysed by Spearman rank correlation. d, Correlation between TDO and 
CYP1B1 expression in microarray data of human bladder cancer (left; n = 58), 
human lung cancer (centre; n = 122) and human ovarian carcinoma (right; 
n = 91) analysed by Spearman rank correlation. e, Survival probabilities of 
patients with glioma (WHO grades II-IV) with high expression (red) of TDO 
(left) or AHR (right) compared with those of patients with intermediate (blue) 
or low (green) expression of these genes derived from Rembrandt’. f, Survival 
probabilities of patients with glioblastoma with high expression (red) of the 
AHR target gene CYP1B1 compared with those of patients with low (green) 
expression of CYP1B1 derived from the glioblastoma data set of the Cancer 
Genome Atlas network"* (n = 362). g, Synoptic figure highlighting the 
autocrine and paracrine effects of TDO-derived Kyn on cancer cells and 
immune cells through the AHR. 
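
The Spearman rank correlation named in panels b to d of this legend can be sketched as the Pearson correlation of the ranked values, with tied values receiving their average rank. The helper names and the example scores below are hypothetical and serve only to illustrate the statistic; they are not data or code from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical H-scores for two markers in five sections (illustration only).
    marker_a = [40, 120, 150, 210, 280]
    marker_b = [60, 90, 170, 160, 300]
    print(spearman_rho(marker_a, marker_b))
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of the expression values, which makes it suitable for comparing H-scores or microarray intensities that are not normally distributed.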
We now link these two important pathways contributing to cancer 
progression by showing that Trp catabolism leads to AHR activation, 
and we provide evidence of a pathophysiological human condition 
that is associated with the production of sufficient amounts of a func- 
tionally relevant endogenous AHR ligand. Our results reveal a differ- 
ential response of primary immune cells and transformed cancer cells 
to AHR-mediated signals, which is in line with various toxicological 
studies using the classical exogenous AHR ligands TCDD and 
3-methylcholanthrene’?’”**. Exposure to these xenobiotics leads to 
profound suppression of cellular and humoral immune responses”, 
while also promoting carcinogenesis and inducing tumour 
growth'’’. These cell-specific differences in AHR effects are likely 
to depend on the expression of factors differentially regulating AHR 
signal transduction such as the AHRR” as well as cell-specific tran- 
scription factor crosstalk shaping the response to AHR activation”. 

It is likely that Kyn-mediated activation of the AHR is not only 
relevant in the setting of cancer. For instance, activation of the mouse 
and human AHR by agonistic ligands induces regulatory T cells*”*. 
Mice with a poor-affinity AHR suffer from exacerbated autoimmune 
encephalomyelitis in the absence of an exogenous ligand’’, and Trp 
catabolites suppress autoimmune neuroinflammation””, suggesting 
that activation of Trp catabolism represents an endogenous feedback 
loop to restrict inflammation through the AHR. In fact, exogenous 
Kyn is involved in the regulation of immune cells in mice through 
the AHR*'”’. Kyn concentrations sufficient to activate the AHR are 
also generated by IDO in response to inflammatory stimuli (Supplementary 
Fig. 11a-c). In a broader context, a significant number 
of malignancies arise from areas of mostly chronic infection and 
inflammation®’, where Trp catabolism in the tumour microenviron- 
ment is activated and sustains local immune suppression™. Activation 
of the AHR by Kyn generated in response to inflammatory stimuli 
may thus constitute a previously unrecognized pathway connecting 
inflammation and carcinogenesis. 


METHODS SUMMARY 


TDO expression was analysed by immunohistochemistry in human tumours. Its 
relevance for Trp degradation was determined by genetic knockdown or over- 
expression of TDO. Trp and Kyn were measured in cell culture supernatants, 
human sera and xenograft tissue by high-performance liquid chromatography. 
Mixed leukocyte reactions, chromium release, ELISpot and staining of immune 
cells in tumour tissues were used to assess the immune effects of TDO activity. 
Cell cycle analysis, Matrigel and spheroid invasion assays, scratch assays, sphere 
formation assays and clonogenicity assays were employed to analyse the auto- 
crine effects of TDO activity. All animal procedures followed the institutional 
laboratory animal research guidelines and were approved by the governmental 
authorities. Orthotopic implantation of human glioma cells with and without 
stable knockdown of TDO into CD1 nu/nu mice, subcutaneous injection of these 
cells into NK-depleted or wild-type CD1 nu/nu mice and subcutaneous injection 
of murine Tdo-proficient and Tdo-deficient GL261 cells into syngeneic C57BL/ 
6N mice were performed to analyse the autocrine and paracrine effects of TDO 
activity in vivo. Microarray analysis of Kyn-treated human glioma cells was 
performed to identify signalling pathways activated by Kyn. Analysis of AHR 
translocation, DRE-luciferase assays and radioligand binding assays confirmed 
activation of the AHR by Kyn. Pharmacological inhibition and stable knockdown 
of the AHR (in vitro and in vivo) proved that the effects of Kyn are AHR dependent. 
Injection of Tdo-proficient and Tdo-deficient tumour cells into Ahr+/+ and 
Ahr−/− mice was used to address the contribution of host effects to TDO-
mediated cancer promotion. Finally, stainings, microarray data and clinical data 
on human tumour tissues were used to analyse whether TDO activates the AHR 
in human cancers and how this affects survival. 
Inhibition of BET recruitment to chromatin as an 
effective treatment for MLL-fusion leukaemia 

Recurrent chromosomal translocations involving the mixed lineage 
leukaemia (MLL) gene initiate aggressive forms of leukaemia, which 
are often refractory to conventional therapies’. Many MLL-fusion 
partners are members of the super elongation complex (SEC), a 
critical regulator of transcriptional elongation, suggesting that 
aberrant control of this process has an important role in leukaemia 
induction”*. Here we use a global proteomic strategy to demon- 
strate that MLL fusions, as part of SEC’ and the polymerase- 
associated factor complex (PAFc)**, are associated with the BET 
family of acetyl-lysine recognizing, chromatin ‘adaptor’ proteins. 
These data provided the basis for therapeutic intervention in MLL- 
fusion leukaemia, via the displacement of the BET family of proteins 
from chromatin. We show that a novel small molecule inhibitor of 
the BET family, GSK1210151A (I-BET151), has profound efficacy 
against human and murine MLL-fusion leukaemic cell lines, 
through the induction of early cell cycle arrest and apoptosis. 
I-BET151 treatment in two human leukaemia cell lines with differ- 
ent MLL fusions alters the expression of a common set of genes 
whose function may account for these phenotypic changes. The 
mode of action of I-BET151 is, at least in part, due to the inhibition 
of transcription at key genes (BCL2, C-MYC and CDK6) through 
the displacement of BRD3/4, PAFc and SEC components from 
chromatin. In vivo studies indicate that I-BET151 has significant 
therapeutic value, providing survival benefit in two distinct mouse 
models of murine MLL-AF9 and human MLL-AF4 leukaemia. 
Finally, the efficacy of I-BET151 against human leukaemia stem 
cells is demonstrated, providing further evidence of its potent thera- 
peutic potential. These findings establish the displacement of BET 
proteins from chromatin as a promising epigenetic therapy for 
these aggressive leukaemias. 

Dysregulation of chromatin modifiers is a recurrent and sentinel 
event in oncogenesis®. Therapeutic strategies that selectively alter the 
recruitment and/or catalytic activity of these enzymes at chromatin 
therefore hold great promise as targeted therapies®. In this regard the 
bromodomain and extra terminal (BET) family of proteins (BRD2, 
BRD3, BRD4 and BRDT) provide an ideal ‘druggable’ target, because 
they share a common highly conserved tandem bromodomain at their 
amino terminus. Selective bromodomain inhibitors that disrupt the 
binding of BET proteins to histones have recently been described’*; 
however, their true therapeutic scope remains untested. 

To identify the nuclear complexes associated with ubiquitously 
expressed BETs (BRD2/3/4), we performed a systematic global pro- 
teomic survey. Specifically, this involved a tri-partite discovery 
approach (Fig. 1a). In the first approach, bead-immobilized analogues 


of I-BET762 (ref. 9) were incubated with HL60 nuclear extracts and 
bound proteins were analysed by quantitative mass spectrometry 
(Supplementary Table 1). This approach identified the BET isoforms 
and a large number of co-purifying proteins (Supplementary Tables 1 
and 2), indicating that the BET isoforms reside in many distinct protein 
complexes. In the second approach, immunoprecipitation analyses 
with selective antibodies against BRD2/3/4 were performed (Sup- 
plementary Fig. 1 and Supplementary Tables 3 and 4). This was com- 
plemented with additional immunoprecipitations using selected 
antibodies against complex members (‘baits’) selected from the subset 
of proteins that were identified in the first approach (Fig. 1b right 
panel, Supplementary Fig. 2 and Supplementary Table 3). In the third 
approach, bead-immobilized histone H4(1-21; K5acK8acK12ac) 
acetylated peptides were used to purify protein complexes. These data 
were combined to highlight a list of complexes identified in all three 
methods (Fig. 1b left panel, Supplementary Fig. 3 and Supplementary 
Table 1). Finally, specificity of the I-BET762 and histone tail matrix 
was further assessed by competition experiments (Fig. 1c, Supplemen- 
tary Figs 4, 5 and Supplementary Table 2). This strategy enabled 
the direct determination of the targets of the inhibitor, and the proteins 
associated with the target, with subunits of protein complexes 
exhibiting closely matching half-maximum inhibitory concentration 
(IC50) values’®. Taken together these stringent and complementary 
approaches provide a high confidence global data set encompassing 
all known''”? and several novel BET protein complexes (Fig. 1b and 
Supplementary Fig. 3). Among the novel complexes, we observed a 
prominent enrichment and dose-dependent inhibition of several com- 
ponents of the PAFc** and SEC”® (Fig. 1b, c), which were confirmed by 
reciprocal immunoprecipitations in HL60 cells (Fig. 1b). Moreover, 
reciprocal immunoprecipitations in two MLL-fusion leukaemia cell 
lines (MV4;11 and RS4;11) confirmed the relationship of SEC with 
BRD4 in different cellular contexts (Fig. 1d). Together these data indi- 
cate that BRD3 and BRD4 associate with the PAFc and SEC and may 
function to recruit these complexes to chromatin. Given that these 
complexes are crucial for malignant transformation by MLL fusions” ° 
we tested the hypothesis that displacement of BET proteins from chro- 
matin may have a therapeutic role in these leukaemias. 

To progress our studies with an optimized therapeutic agent we 
developed I-BET151 (Fig. 1e); a novel dimethylisoxazole template, 
previously undisclosed as a BET bromodomain inhibitor. It was iden- 
tified and optimized to retain excellent BET target potency (Fig. 1i) 
and selectivity (Fig. 1h, Supplementary Figs 5-10 and Supplementary 
Table 5) while enhancing the in vivo pharmacokinetics and terminal 
half-life to enable prolonged in vivo studies (Fig. 4a and Supplementary 
Figure 1 | A global proteomic survey identifies BET proteins as part of the 
PAFc and SEC. a, Proteomic strategy. b, Left, Cytoscape representation of the 
BET protein complex network (discussed in detail in Supplementary Fig. 3). 
Bold circles indicate associations confirmed by the three orthogonal methods. 
Right, heat map representing quantitative-mass spectrometry data following 
co-immunoprecipitation of BETs, PAF and SEC complex members. 

c, Differential proteomic analysis of the proteins interacting with I-BET and 
triple acetylated histone H4 tail. Left, affinity matrices with immobilized 
I-BET762 or histone H4(K5acK8acK12ac) peptide bind to the same set of BET 
complexes. Protein abundance was determined from signal intensities in the 
mass spectrometer (arbitrary units, K = 1,000). Right, competitive inhibition 
of the binding of BET isoforms, and SEC and PAF complex components, to the 
I-BET762 matrix showing matching concentration dependence. d, BRD4 and 
MLLT1 interact in HL60, MV4;11 and RS4;11 cells and binding to the 
I-BET762 matrix is blocked by excess I-BET151. e, Chemical structure of 
GSK1210151A (I-BET151). f, I-BET151 binding to the acetyl-binding pocket of 
BRD4-BD1 (cyan) overlaid with H3K14-acetyl peptide (green) (Protein 


Fig. 20). We also generated proteomic selectivity profiles comparing 
I-BET151 with I-BET762 (Fig. 1h, Supplementary Fig. 5 and Sup- 
plementary Table 6). We bead-immobilized a combination of differ- 
entially acetylated histone tail peptides (Supplementary Table 7), 
which captured a total of 27 bromodomain proteins from HL60 nuc- 
lear extracts. Competition with excess I-BET151 or I-BET762 blocked 
the capture of BRD2, BRD3, BRD4, and BRD9 but had no effect on the 
23 other bromodomain proteins including MLL. The inhibition of 
BRD9 is likely to be indirect as this protein forms a complex with 
BRD4 (Supplementary Table 3). Finally, a high-resolution (1.5 Å) 
crystal structure of I-BET151 bound to BRD4-bromodomain 1 
Database ID 3jvk). A surface representation of the BRD4-BD1 is shown with 
key recognition and the specificity WPF shelf identified. g, Ribbon 
representation of the BRD4-BD1 (cyan) crystal structure complexed with 
I-BET151 (shown in magenta stick format) overlaid with H3(12-19)K14ac 
peptide (green) taken from its complex with BRD4-BD1(PDB ID 3jvk). 
Secondary elements of the BRD4-BD1 structure have been highlighted. 

h, Selectivity profile of I-BET151 showing average temperature shifts (Tm) 
using a fluorescent thermal shift assay. Numbering inside the spheres indicates 
bromodomains assessed; for example, 12 signifies both bromodomains 1 and 2 
have been assessed. Overlaid is the selectivity profile generated using a 
proteomic approach (shown as boxes around proteins, discussed in 
Supplementary Fig. 5). Where the bromodomains have been profiled by both 
thermal shift and proteomic approaches the agreement is excellent. Proteins 
not assessed by either technique are shown in grey. i, Comparison of I-BET762 
and I-BET151 potency in ligand displacement assays, direct Biacore binding 
and lipopolysaccharide-stimulated IL-6 cytokine production from human 
peripheral blood mononuclear cells (PBMC) or whole blood (WB). 


(BD1) revealed binding to the acetylated-lysine (AcK) recognition 
pocket of the BET protein (Fig. 1f, g and Supplementary Fig. 10). 

To assess the therapeutic efficacy and selectivity of I-BET151, we 
tested a panel of leukaemic cell lines harbouring a spectrum of distinct 
oncogenic drivers. These data demonstrated that I-BET151 has potent 
efficacy against cell lines harbouring different MLL-fusions (Fig. 2a 
and Supplementary Fig. 11). To extend these data we tested the 
clonogenic potential of human leukaemic cells grown in cytokine- 
supplemented methylcellulose containing dimethylsulphoxide 
(DMSO; vehicle) or I-BET151. Consistent with the profound effects 
in liquid culture, the colony-forming potential of MLL-fusion-driven 
Figure 2 | I-BET151 selectively and potently inhibits MLL-fusion leukaemic 
cell lines in vitro. a, Human leukaemia cell lines tested using I-BET151. 

b, Clonogenic assays performed in the presence of DMSO or I-BET151. 

c, Haematopoietic progenitors were isolated from mouse bone marrow and 
retrovirally transformed with MLL-ENL or MLL-AF9. These cells were used in 
both proliferation and clonogenic assays. d, Apoptosis was assessed by FACS 


leukaemias (MOLM13) was completely ablated by I-BET151, whereas 
leukaemias driven by tyrosine kinase activation (K562) were un- 
affected (Fig. 2b). In addition to the data with human leukaemic 
cell lines, we also confirmed the potent efficacy of I-BET151 in both 
liquid culture and clonogenic assays using primary murine progenitors 


analysis after 72 h incubation with DMSO or I-BET151. e, Cell cycle 
progression was assessed by FACS analysis 24 h after incubation with DMSO or 
I-BET151 (y axis event count, x axis arbitrary fluorescence units). Bar graphs 
are represented as the mean and error bars reflect standard deviation of results 
derived from triplicate experiments. 


retrovirally transformed with either MLL-ENL or MLL-AF9 (Fig. 2c). 

To investigate the mechanism of action for I-BET151, we performed 
fluorescence-activated cell sorting (FACS) analysis to assess apoptosis 
and cell cycle progression after I-BET151 treatment. Figure 2d-e and 
Supplementary Fig. 12 show a marked induction of apoptosis and a 
Figure 3 | Transcriptome and ChIP analyses provide mechanistic insights 
for the efficacy of I-BET151. a, Volcano plots for DMSO against I-BET151 
treated samples, showing the adjusted significance P value (log) versus fold 
change (log2). b, Correlation of log2 fold change between MV4;11 and 
MOLM13 across all genes. No genes show opposing expression changes. Lines 
represent the identity line (black solid), the line of best fit (black dotted), or log2 
fold-change threshold values (green dotted). c, Heat map of top 100 genes 


downregulated following treatment with I-BET151. d, BCL2 gene expression 
(normalized to B2M expression) is shown. Expression level of BCL2 in DMSO 
was assigned a value of 1. e, Immunoblotting demonstrating a decrease in BCL2 
and an increase in cleaved PARP (*) after I-BET151 treatment. f, ChIP analysis 
at the TSS and 3’ end of BCL2 is illustrated. Bar graphs are represented as the 
mean enrichment relative to input and error bars reflect standard deviation of 
results derived from biological triplicate experiments. 
Figure 4 | I-BET151 is efficacious in in vivo murine models and primary 
patient samples of MLL-fusion leukaemia. a, Murine pharmacokinetic 
studies (mean ± s.d., n = 4 per compound) comparing the blood concentration 
of I-BET151 with I-BET762 and JQ1. b, Kaplan-Meier curve of control and 
treated NOD-SCID mice transplanted with 1 x 10’ MV4;11 cells. Green 
arrowhead, treatment commencement on day 21. c, Haematoxylin and eosin- 
stained histological sections of the renal parenchyma of control and treated 
mice. Black arrows highlight leukaemic infiltration. d, Representative FACS 
analysis from the peripheral blood of control or I-BET151-treated mice. 

e, Kaplan-Meier curve of control and treated C57BL/6 mice transplanted with 
2.5 × 10⁶ syngeneic MLL-AF9 leukaemic cells. Green arrowhead, treatment 
commencement on day 9. f, Photomicrograph of the spleen size from 5/8 
control and 1/12 I-BET151-treated mice that died on day 12. g, Haematoxylin 


prominent G0/G1 arrest in two MLL-fusion cell lines driven by distinct 
MLL fusions (MOLM13 and MV4;11 containing MLL-AF9 and 
MLL-AF4, respectively). In contrast, the cell cycle characteristics 
and apoptotic rate of K562 cells were largely unaffected at this time. 
These data indicate that I-BET151 alters the transcriptional pro- 
grammes regulating apoptosis and cell-cycle progression in MLL- 
fusion leukaemias. 

To identify the precise transcriptional pathways controlled by 
I-BET151, global gene-expression analysis was performed in 
MOLM13 and MV4;11 cells after treatment with I-BET151 or 
DMSO for 6h. This strategy allowed us to identify early I-BET151- 
responsive genes, before any discernable phenotypic alteration in cell 
cycle or apoptosis (Supplementary Fig. 12). As demonstrated previ- 
ously’, we observed differential expression of a selective subset of genes 
(Fig. 3a), rather than global transcriptional dysregulation. Remarkably, 
the transcriptional programmes altered in the two MLL-fusion cell 
lines were highly correlated (Fig. 3b) and gene set enrichment analysis 
documented significant overlap with published MLL fusion signatures 
including MLL-fusion leukaemia stem cells (LSC)'*"> (Supplementary 
Fig. 13). These data are consistent with the notion that MLL fusions 
aberrantly co-opt the SEC and PAFc to regulate similar transcriptional 
and eosin-stained histological sections of the liver parenchyma from control 
and I-BET151-treated mice demonstrating reduced disease burden in the 
treated animal. h-j, Peripheral blood white cell count (h), liver weight (i) and 
spleen weight (j) from all the control and treated mice at the time of necropsy. 
k, Representative FACS analysis assessing apoptosis from a patient with MLL-AF6 
leukaemia. l, Clonogenic assays with human MLL-fusion LSC isolated by 
FACS sorting (CD34⁺/CD38⁻) and plated in the presence of DMSO or 
I-BET151. m, Gene expression changes in human MLL-fusion leukaemia cells 
following treatment with I-BET151 or DMSO. The log₂ fold change in the 
expression level for all genes (expression level with I-BET151 treatment/ 
expression level with DMSO) is represented. n, Schematic model proposing the 
mode of action for I-BET151 in MLL-fusion leukaemia. 


programmes. Notably, the top 100 genes concomitantly decreased in 
both MOLM13 and MV4;11 (Fig. 3c) contained several previously 
reported direct MLL targets, such as BCL2, CDK6 and MYC, the down- 
regulation of which was consistent with the phenotypic consequences 
of I-BET151 treatment. 

BCL2 is a key antiapoptotic gene implicated in the pathogenesis of 
MLL-fusion leukaemias!®!’. Consistent with these data, I-BET151 
reduced the expression of BCL2 in a third MLL-fusion cell line 
(NOMO1) but not in the unresponsive K562 cells (Fig. 3d), and induc- 
tion of apoptosis coincided with a marked reduction in BCL2 protein 
expression (Fig. 3e). Moreover, overexpression of BCL2 in the pres- 
ence of I-BET151 rescued the apoptotic phenotype (Supplementary 
Fig. 14). Chromatin immunoprecipitation (ChIP) analyses at the BCL2 
locus showed that 6 h of I-BET151 treatment selectively decreased the 
recruitment of BRD3/4 and impaired recruitment of CDK9 and PAF1 
(part of SEC and PAFc, respectively) to the transcriptional start site 
(TSS). This correlated with reduced phosphorylation of RNA polymerase II 
(Pol II) on serine 2 of its carboxy-terminal domain (Pol II S2ph) 
(Fig. 3f). A similar pattern was observed at two other MLL 
target genes (MYC and CDK6), but not at housekeeping genes (B2M) 
whose expression was unaltered by I-BET151 (Supplementary Fig. 15). 
Together, these data indicate that the mechanism of efficacy for 
I-BET151 involves a selective abrogation of BRD3/4 recruitment to 
chromatin. The consequence of this is the inefficient phosphorylation/ 
recruitment of Pol II. Further investigation is necessary to distinguish 
whether Pol II recruitment and/or elongation is primarily affected by 
I-BET151. 

We next sought to establish the therapeutic potential of I-BET151 in 
vivo. We first characterized the pharmacokinetic properties of 
I-BET151 in several preclinical species (Supplementary Fig. 20) and 
also compared it to published inhibitors”* (Fig. 4a). We then assessed 
the efficacy of I-BET151 in two established models of MLL leukaemia. 
Our first model was a xenotransplant model of disseminated human 
MLL-AF4 leukaemia¹⁸. I-BET151 was delivered daily at 30 mg kg⁻¹ by 
intraperitoneal injection from day 21 (ref. 18), and mice were humanely 
killed if clinical disease dictated or if there was a sequential rise in 
peripheral blood disease. At the experimental end-point all the control 
mice had succumbed to fulminant or progressive disease whereas only 
one out of five mice in the treated cohort had evidence of disease at low 
levels (Fig. 4b-d and Supplementary Fig. 16). In our second syngeneic 
model of murine MLL-AF9 leukaemia, 2.5 × 10⁶ leukaemic cells, established 
from serial transplantation, were injected into tertiary recipients. 
Despite the latency being reduced to less than 15 days, we waited to 
initiate treatment from day 9 to test the efficacy of I-BET151 in the 
setting of overwhelming established disease (Fig. 4e), the scenario often 
encountered in clinical practice. Even here I-BET151 provided a clear 
and marked survival benefit (Fig. 4e-j and Supplementary Fig. 17). 
Taken together, these data demonstrate that I-BET151 provides excel- 
lent control of MLL leukaemia progression in two distinct and comple- 
mentary murine models. 

Finally, to demonstrate the applicability of our findings to human 
disease, we tested the efficacy of I-BET151 in leukaemia cells isolated 
from patients with various MLL fusions. These data show that 
I-BET151 accelerates apoptosis (Fig. 4k and Supplementary Fig. 18), 
and abrogates clonogenic efficiency in bulk leukaemia (Supplementary 
Fig. 19) as well as isolated LSC (Fig. 4l). These effects are driven, at least 
in part, by downregulation of a similar transcription programme iden- 
tified in MLL-fusion cell lines (Fig. 4m). Taken together, these data 
provide compelling evidence of therapeutic potential and suggest that 
disease eradication is possible. 

The paradigm for epigenetic drug discovery shown here highlights an 
emerging role for targeting aberrant transcriptional elongation in onco- 
genesis” ° and provides the first example in epigenetic therapy where 
mechanistic insights have driven targeted drug discovery and application 
(Fig. 4n). Together, our results suggest that perturbing the interaction of 
BET proteins with chromatin using I-BET151 may be of great thera- 
peutic value in human MLL-fusion leukaemias. Using a complementary 
strategy and a different BET inhibitor, a separate study published in this 
issue concurs with this view'’. Moreover, the extensive proteomic 
resource provided here has identified other important disease-associated 
proteins binding to BET proteins, such as MMSET (WHSC1), which is 
implicated in multiple myeloma”’. This raises the possibility that BET 
inhibitors may have an even wider therapeutic scope in oncology and 
perhaps in other areas of unmet need within the clinical arena. 


METHODS SUMMARY 


Cell culture, gene expression, chromatin immunoprecipitation and FACS analysis 
were performed as previously described”’. Proteomic profiling and characterization 
of inhibitor specificity was performed using methodology previously described”””. 
Detailed information about the reagents and methodology used in this study is 
available in Supplementary Information. 
Ocean-like water in the Jupiter-family comet 103P/Hartley 2 


Paul Hartogh¹, Dariusz C. Lis², Dominique Bockelée-Morvan³, Miguel de Val-Borro¹, Nicolas Biver³, Michael Küppers⁴, 
Martin Emprechtinger², Edwin A. Bergin⁵, Jacques Crovisier³, Miriam Rengel¹, Raphael Moreno³, Slawomira Szutowicz⁶ 
& Geoffrey A. Blake² 


For decades, the source of Earth’s volatiles, especially water with a 
deuterium-to-hydrogen ratio (D/H) of (1.558 ± 0.001) × 10⁻⁴, has 
been a subject of debate. The similarity of Earth’s bulk composition 
to that of meteorites known as enstatite chondrites¹ suggests a dry 
proto-Earth² with subsequent delivery of volatiles³ by local accretion⁴ 
or impacts of asteroids or comets⁵,⁶. Previous measurements 
in six comets from the Oort cloud yielded a mean D/H ratio of 
(2.96 ± 0.25) × 10⁻⁴. The D/H value in carbonaceous chondrites, 
(1.4 ± 0.1) × 10⁻⁴, together with dynamical simulations, led to 
models in which asteroids were the main source of Earth’s water⁷, 
with ≤10 per cent being delivered by comets. Here we report that 
the D/H ratio in the Jupiter-family comet 103P/Hartley 2, which 
originated in the Kuiper belt, is (1.61 ± 0.24) × 10⁻⁴. This result 
substantially expands the reservoir of Earth ocean-like water to 
include some comets, and is consistent with the emerging picture 
of a complex dynamical evolution of the early Solar System⁸,⁹. 

On 17 November 2010, using the Herschel Space Observatory, we 
determined the D/H ratio in a comet from a reservoir other than the 
Oort cloud—103P/Hartley 2. Such Jupiter-family comets are believed 
to originate from the Kuiper belt, which exists beyond the orbits of the 
giant planets at radii between 30 and 50 astronomical units’? (1 AU is 
the average Earth-Sun distance). In contrast, Oort-cloud comets are 
theorized to have originated from radii near the gas giants and to have 
been subsequently ejected to the Oort cloud (>5,000 au)". The 
Herschel measurement therefore traces the water D/H ratio in a new 
population of water-ice-rich bodies in the Solar System that are a 
potential source of water on the Earth. 

To obtain an accurate determination of the D/H ratio in water, we 
carried out simultaneous observations of optically thin isotopic variants 
of water, specifically HDO and H₂¹⁸O (Fig. 1), as part of our Solar 
System observing programme’. This was critical for comet 103P/Hartley 2, 
whose activity and water outgassing rates exhibited significant 
short-term variations’*. We used state-of-the-art excitation models 
to determine the HDO and H₂¹⁸O beam-integrated column densities 
and production rates from the measured line intensities. Observation 
and modelling details are given in Supplementary Information. A 
critical point is that all observations sampled the same region of the 
coma, about 6,500 km in diameter. 

The retrieved gas column densities and production rates are sensitive 
to collisional cross-sections, along with the density and temperature 
profiles of H₂O and electrons, and we thus considered a range of model 
parameters (Table 1). Although the production rates determined for the 
various model parameters differ slightly, the value of the D/H ratio is 
estimated to be (1.61 ± 0.24) × 10⁻⁴. In our analysis, we assumed an 
H₂¹⁶O/H₂¹⁸O ratio of 500 ± 50, a range that encompasses the Earth 
value and is consistent with previous measurements in cometary 
water’ (see also Supplementary Information). The quoted 1σ uncertainty 
in the D/H ratio includes a 5% uncertainty related to modelling. 


Our measured D/H value is substantially larger than that which 
characterized the young Sun (4.5 Gyr ago; the protosolar ratio), 
believed to be about 2.1 × 10⁻⁵, which in turn is slightly higher than 
the value found in the local interstellar medium today (1.6 × 10⁻⁵) and 
Figure 1 | Submillimetre water emission lines from comet 103P/Hartley 2. 
The time of the observations was 20 days after perihelion, when the comet was 
1.095 au from the Sun and 0.212 au from Herschel. Because the H₂O ground 
state rotational lines in comets are optically thick’””°, observations of the rare 
oxygen isotopic counterpart, H₂¹⁸O, provide a more reliable reference for the 
D/H determination. The spectra of the 1₁₀–1₀₁ lines of HDO (a) and H₂¹⁸O 
(b) at 509.292 and 547.676 GHz, respectively, were obtained with the 
Heterodyne Instrument for the Far Infrared (HIFI) High Resolution 
Spectrometer (HRS) between 17.28 and 17.64 November 2010 UT. The line 
intensities, expressed in the main-beam brightness temperature scale, are 
0.011 ± 0.001 and 0.117 ± 0.002 K km s⁻¹ for HDO and H₂¹⁸O, respectively, 
averaging the two instrument polarizations. The velocity scale is given relative 
to the velocity of the comet’s nucleus. The spectral resolution is 141 and 
132 m s⁻¹ for the HDO and H₂¹⁸O spectra, respectively. For details of the 
observational sequence and basic parameters of the data analysis, see 
Supplementary Information. 


¹Max-Planck-Institut für Sonnensystemforschung, Max-Planck-Str. 2, 37191 Katlenburg-Lindau, Germany. ²California Institute of Technology, Pasadena, California 91125, USA. ³LESIA-Observatoire de 
Paris, CNRS, UPMC, Université Paris-Diderot, 5 place Jules Janssen, 92195 Meudon, France. ⁴Rosetta Science Operations Centre, European Space Astronomy Centre, 28691 Villanueva de la Cañada, 
Madrid, Spain. ⁵Astronomy Department, University of Michigan, Ann Arbor, Michigan 48109, USA. ⁶Space Research Centre, Polish Academy of Sciences, 00-716 Warsaw, Poland. 
Table 1 | Calculating the D/H ratio in water in comet 103P/Hartley 2 

Model  Tgas  xₙₑ   <N(HDO)>      Q(HDO)       <N(H₂¹⁸O)>    Q(H₂¹⁸O)     D/H 
       (K)         (10¹⁰ cm⁻²)   (10²⁴ s⁻¹)   (10¹¹ cm⁻²)   (10²⁵ s⁻¹) 

(1)    50    0.    4.9           3.1          3.5           2.1          1.49 × 10⁻⁴ 
       50    0.2   3.6           2.3          2.5           1.5          1.55 × 10⁻⁴ 
       70    0.2   3.7           2.4          2.5           1.5          1.60 × 10⁻⁴ 
       Law   0.1   5.7           3.6          3.7           2.2          1.63 × 10⁻⁴ 

(2)    50    0.2   4.3           2.7          2.9           1.8          1.54 × 10⁻⁴ 
       Law   0.1   4.8           3.1          3.2           1.9          1.58 × 10⁻⁴ 

The parameter xₙₑ, scaling the electron density profile in the models, is constrained by mapping 
observations. <N> and Q are respectively the beam-integrated column density and production rate, 
determined using different parameters in the excitation models'**°. Production rates were computed 
assuming isotropic outflow of water from the nucleus, with a velocity of 0.6 km s⁻¹, consistent with the 
width of the H₂¹⁸O line. We accounted for the 10″ offset between the centre of the beam and the 
position of the peak of the H₂O distribution. Values of 50 and 70 K for the gas kinetic temperature, Tgas, 
are consistent with multi-transition measurements of gaseous species in the millimetre and near- 
infrared range, respectively. The gas kinetic temperature is expected to decrease with increasing 
distance owing to quasi-adiabatic expansion of the escaping gases: the temperature law assumes that 
Tgas = 80 K for r < 270 km, Tgas = 12 K for r > 630 km, with a linear decrease between 270 and 630 km, 
where r is the distance from the nucleus. Collision cross-sections involving water molecules and 
electrons are modelled differently in models (1) and (2). Both models use an electron density profile 
based on in situ measurements in comet 1P/Halley scaled to the activity of 103P/Hartley 2 (ref. 29). The 
D/H ratio is equal to 0.5 × Q(HDO)/Q(H₂O), with Q(H₂O) = 500 × Q(H₂¹⁸O). See Supplementary 
Information for details of the models and model parameters. 
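
The two relations stated in the note to Table 1 (the D/H formula and the piecewise temperature law) can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the excitation modelling used in the analysis. The function names `d_to_h` and `t_gas_law` are hypothetical, and the inputs are the rounded production rates from the first row of Table 1, so the result matches the tabulated D/H only to within rounding. The factor 0.5 reflects that each H₂O molecule carries two hydrogen atoms whereas HDO carries one deuterium atom.

```python
def d_to_h(q_hdo, q_h2_18o, o16_to_o18=500.0):
    """D/H = 0.5 * Q(HDO) / Q(H2O), with Q(H2O) = o16_to_o18 * Q(H2-18O).

    Production rates are in molecules per second; 500 is the
    16O/18O ratio assumed in the text.
    """
    q_h2o = o16_to_o18 * q_h2_18o
    return 0.5 * q_hdo / q_h2o


def t_gas_law(r_km):
    """Gas kinetic temperature (K) at distance r_km from the nucleus.

    80 K inside 270 km, 12 K outside 630 km, linear decrease between,
    as described in the note to Table 1.
    """
    r_in, r_out = 270.0, 630.0
    t_in, t_out = 80.0, 12.0
    if r_km <= r_in:
        return t_in
    if r_km >= r_out:
        return t_out
    frac = (r_km - r_in) / (r_out - r_in)
    return t_in + frac * (t_out - t_in)


# Model (1), Tgas = 50 K, first row of Table 1 (rounded values):
# Q(HDO) = 3.1e24 s^-1 and Q(H2-18O) = 2.1e25 s^-1.
ratio = d_to_h(3.1e24, 2.1e25)
print(f"D/H from rounded table inputs: {ratio:.3e}")  # tabulated: 1.49e-4
print(f"Tgas at 450 km: {t_gas_law(450.0):.1f} K")
```

Applying `d_to_h` to the other rows gives values that scatter by a few per cent around the tabulated ones, again because the table quotes the production rates to two significant figures.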


comparable to the primordial D/H ratio in the Universe after the Big 
Bang (Fig. 2)'°. Protosolar water, on the other hand, is believed to be 
highly enriched (D/H ~ 1 × 10⁻³)'® due to the low-temperature 
(~10-30 K) non-equilibrium chemistry that characterizes the dense 
interstellar medium”, either via gas-phase isotopic exchange reactions 
involving ions and radicals, or grain-surface processes. Consequently, 
the resulting D/H ratio in water ice is very sensitive to the physical 
conditions, in particular the kinetic temperature of the medium. After 
the protosolar cloud collapsed to form the solar nebula, isotopic 
exchange reactions between molecular hydrogen and HDO molecules 
would have led to a gradual reduction of D/H in water'®, as compared 
to the initial interstellar medium value. Because the efficiency of these 
reactions and the turbulent mixing within the solar nebula is correlated 
with the gas density and temperature, the deuterium enhancement in 
water has been predicted to increase with the heliocentric distance’”’. 
Ices, captured by planetesimals and cometesimals, would have then 
preserved the deuterium enrichment in water from this early epoch. As 
Figure 2 | D/H ratios in the Solar System. Orange squares, values measured 
for water in the Oort-cloud comets 1P/Halley, C/1996 B2 (Hyakutake), C/1995 
O1 (Hale-Bopp), C/2002 T7 (LINEAR) and 8P/Tuttle. Arrow (for 153P/Ikeya- 
Zhang), upper limit. Purple square, present measurement in the water of 103P/ 
Hartley 2. Black symbols, D/H ratio in H₂ in the atmosphere of the giant 
planets—Jupiter (J), Saturn (S), Uranus (U) and Neptune (N). Light blue and 
green symbols, D/H values for water in the plume of Saturn’s moon Enceladus 
and in CI carbonaceous chondrites, respectively. Error bars, 1σ. The D/H 
a result, small Solar System bodies are expected to exhibit different 
D/H ratios in their water ice depending on the distance from the Sun at 
which they were formed. 

In the context of this simple nebular model, the D/H ratio of 
(1.61 ± 0.24) × 10⁻⁴ in comet 103P/Hartley 2—a factor of two lower 
than that measured in Oort-cloud comets (Fig. 2) and, within uncertainties, 
consistent with that of the Earth’s oceans (for which the 
Vienna Standard Mean Ocean Water (VSMOW) value is 
(1.558 ± 0.001) × 10⁻⁴)—is therefore surprising, and compatible 
with two different schemes: (1) either this comet did not form in a 
region that was further from the Sun than the assembly zone of the 
Oort-cloud comets, or (2) the dependence of the water D/H ratio with 
distance from the Sun is not as expected on the basis of current models. 
Concerning the first possibility, dynamical models indeed suggest that 
a fraction of the Jupiter-family comets originate in the Oort cloud”. 
Still, even if comet 103P/Hartley 2 stems from the Oort cloud, this 
would not explain why its D/H ratio is different from that seen in other 
Oort-cloud comets. Models also suggest that a fraction of the Jupiter- 
family comets may have originated from the Trojan asteroid swarms 
sharing the orbit of Jupiter”. The Trojans are generally thought to have 
resided at their current location since the formation of the Solar 
System. Therefore, Jupiter-family comets originating in the Trojan 
region could, in theory, display deuterium enrichment values lower 
than those for bodies originating in the Kuiper belt, if they indeed 
formed in the vicinity of Jupiter. However, the most probable scenario 
is that 103P/Hartley 2 originated in the Kuiper belt. 

It is difficult to explain the low D/H ratio in 103P/Hartley 2 (com- 
pared to that of previous measurements in comets) with the formation 
regions of comets, thus models of the gradient of D/H in the Solar 
System—predictions not yet directly confirmed by observations, 
owing to scarcity of accurate isotopic measurements—may need to 
be revisited. In fact, one recent model has suggested that the D/H ratio 
of water vapour can be locally enhanced™*. However, the vapour must 
then be implanted into cometary ices. Moreover, until the measure- 
ment of 103P/Hartley 2 there was no observational confirmation of 
variations in the D/H ratio. One possible solution is that there was 
large-scale movement of material between the inner and outer Solar 
determinations in comets originating from the Oort cloud are twice the value 
for the Earth’s ocean (blue line) and about a factor of ten larger than the 
protosolar value in H₂ (broad yellow line), the latter being comparable to the 
value in atomic hydrogen found in the local interstellar medium (ISM, red 
horizontal line). The D/H ratio in the Jupiter-family comet 103P/Hartley 2 is 
the same as the Earth’s ocean value and the chondritic CI value. Uranus and 
Neptune have been enriched in deuterium by the mixing of their atmospheres 
with D-rich protoplanetary ices. For further details, see Supplementary Table 1. 
System. According to a recent theory proposed for the early Solar 
System (the Grand Tack scenario), when the giant planets were still 
embedded in the nebular gas disk, there was a general radial mixing of 
the distribution of comets and asteroids born in different regions”. 
The similarity of the D/H ratio in comet 103P/Hartley 2, which probes 
the Kuiper belt, with that found in CI chondrites tracing the asteroid 
belt, would be in agreement with a general shake-up of the Solar 
System at early times. 

In more ‘orderly’ models, the high D/H values derived from the 
earlier observations of Oort-cloud comets suggested that at most 
10% of the Earth’s water could have been supplied from the outermost 
Solar System, but even under these circumstances a number of scenarios 
have been developed suggesting that terrestrial water could have in fact 
been delivered by comets. Such models are based on assumptions about 
the heliocentric D/H gradient”, and the analysis of lunar samples”’ and 
telluric sedimentary rocks formed at the end of the Late Heavy 
Bombardment phase”. Our Herschel observations of a VSMOW-like 
D/H ratio in 103P/Hartley 2 enlarge the region of the solar nebula 
known to have a D/H ratio similar to that of Earth’s oceans; this region 
now includes both the asteroid belt and the much larger Kuiper belt, 
thereby providing support for the theory of a common water source for 
the inner Solar System bodies (including the Earth) in which comets 
play an important part. 

Further constraints on the delivery of volatiles to the early Earth and 
an improved understanding of the origin of the different dynamical 
classes of comets will require significantly larger sample sizes than 
those at present available. A handful of additional measurements can 
be expected from Herschel before its cryogen supply is exhausted, but 
the comparison of D/H ratios in the inner and outer Solar System must 
necessarily utilize very different objects and materials. For the inner 
Solar System, in situ space missions or sample return missions to the 
outer asteroid belt would provide critical new data. Astronomically, the 
minuscule strength of HDO spectroscopic signatures makes D/H mea- 
surements extremely challenging, and dedicated programmes using 
new facilities will be required to substantially increase the inventory 
of high-precision D/H ratios in comets and other icy Solar System 
bodies, including the Jovian satellites. 
Chemical ozone destruction occurs over both polar regions in local winter-spring. In the Antarctic, essentially complete 
removal of lower-stratospheric ozone currently results in an ozone hole every year, whereas in the Arctic, ozone loss is 
highly variable and has until now been much more limited. Here we demonstrate that chemical ozone destruction over 
the Arctic in early 2011 was—for the first time in the observational record—comparable to that in the Antarctic ozone 
hole. Unusually long-lasting cold conditions in the Arctic lower stratosphere led to persistent enhancement in 
ozone-destroying forms of chlorine and to unprecedented ozone loss, which exceeded 80 per cent over 18-20 
kilometres altitude. Our results show that Arctic ozone holes are possible even with temperatures much milder than 
those in the Antarctic. We cannot at present predict when such severe Arctic ozone depletion may be matched or 
exceeded. 


Since the emergence of the Antarctic ‘ozone hole’ in the 1980s' and 
elucidation of the chemical mechanisms” and meteorological con- 
ditions® involved in its formation, the likelihood of extreme ozone 
depletion over the Arctic has been debated. Similar processes are at 
work in the polar lower stratosphere in both hemispheres, but differ- 
ences in the evolution of the winter polar vortex and associated polar 
temperatures have in the past led to vastly disparate degrees of spring- 
time ozone destruction in the Arctic and Antarctic. We show that 
chemical ozone loss in spring 2011 far exceeded any previously 
observed over the Arctic. For the first time, sufficient loss occurred 
to reasonably be described as an Arctic ozone hole. 


Arctic polar processing in 2010-11 


In the winter polar lower stratosphere, low temperatures induce 
condensation of water vapour and nitric acid (HNO₃) into polar 
stratospheric clouds (PSCs). PSCs and other cold aerosols provide 
surfaces for heterogeneous conversion of chlorine from longer-lived 
reservoir species, such as chlorine nitrate (ClONO₂) and hydrogen 
chloride (HCl), into reactive (ozone-destroying) forms, with chlorine 
monoxide (ClO) predominant in daylight’. 

In the Antarctic, enhanced ClO is usually present for 4-5 months 
(through to the end of September)*™’, leading to destruction of most 
of the ozone in the polar vortex between ~14 and 20 km altitude’. 
Although ClO enhancement comparable to that in the Antarctic 
occurs at some times and altitudes in most Arctic winters’, it rarely 
persists for more than 2-3 months, even in the coldest years’. Thus 
chemical ozone loss in the Arctic has until now been limited, with 
largest previous losses observed in 2005, 2000 and 1996”"7-"*. 

The 2010-11 Arctic winter-spring was characterized by an 
anomalously strong stratospheric polar vortex and an atypically long 
continuously cold period. In February-March 2011, the barrier to 
transport at the Arctic vortex edge was the strongest in either hemisphere 
in the last ~30 years (Fig. 1a, Supplementary Discussion). 

The persistence of a strong, cold vortex from December through to 
the end of March was unprecedented. In the previous years with most 
ozone loss, temperatures (T) rose above the threshold associated with 
chlorine activation (Tact, near 196 K, roughly the threshold for the 
potential existence of PSCs) by early March (Fig. 1b, Supplementary 
Figs 1, 2). Only in 2011 and 1997 have Arctic temperatures below Tact 
persisted through to the end of March, sporadically approaching a 
vortex volume fraction similar in size to that in some Antarctic winters 
(Fig. 1b). In 1996-97, however, the cold volume remained very limited 
until mid-January and was smaller than that in 2011 at most times 
during late January through to the end of March (Fig. 1b, Supplementary 
Figs 1, 2). 

Daily minimum temperatures in the 2010-11 Arctic winter were 
not unusually low, but the persistently cold region was remarkably 
deep (Supplementary Figs 1, 2). Temperatures were below Tact for 
more than 100 days over an altitude range of ~15-23 km, compared 
to a similarly prolonged cold period over only ~20-23 km altitude in 
1997; below ~19 km altitude, T < Tact continued for ~30 days longer 
in 2011 than in 1997 (Supplementary Fig. 1b). In 2005, the previous 
year with largest Arctic ozone loss’, T < Tact occurred for more than 
100 days over ~17-23 km altitude, but all before early March. 

The winter mean volume of air in which PSCs may form (that is, 
with T < Tact), Vpsc, is closely correlated with the potential for ozone 
loss”'*"'7. In 2011, Vpsc (as a fraction of the vortex volume) was the 
largest on record (Fig. 1c). Both large Vpsc and cold lingering well into 
spring are important in producing severe chemical loss”'>’*, and 
2010-11 was the only Arctic winter during which both conditions 
have been met. Much lower fractional Vpsc in 1997 than in 1996, 2000, 
2005 or 2011 (Fig. 1c) is consistent with less ozone loss that year¹⁶,¹⁷. 
Figure 1 | Meteorology of the Arctic lower stratosphere. a, Vortex strength 
(as indicated by maximum potential vorticity” (PV) gradients) at 460 K 
potential temperature (~18 km altitude, ~65 hPa level). b, Fraction of vortex 
volume at potential temperatures between 390 and 550 K with a temperature 
less than the chlorine activation threshold (Tact). Light (dark) grey shading 
shows range of Arctic (Antarctic) values for 1979-2010. Antarctic dates are 
shifted by six months (top axis in a) to show the equivalent season. c, Winter 
mean Vpsc during the past 32 years, expressed as a fraction of vortex volume. 
Red, orange, green, purple and blue lines/bars show the 2010-11, 2004-05, 
1999-2000, 1996-97 and 1995-96 Arctic winters, respectively. 


Factors playing secondary parts in governing interannual variability 
in ozone destruction, including vortex strength, structure and position 
relative to the cold region, also favour large loss in 2011 (Supplementary 
Figs 2, 3, Supplementary Discussion). However, despite 
the fraction of the vortex with T < Tact and mid-March temperatures 
sporadically approaching those seen in the Antarctic (Fig. 1b, 
Supplementary Fig. 1a), even in 2011 temperatures were much higher, 
and the cold regions much smaller, than those in most Antarctic 
winters. 

Satellite trace-gas and PSC measurements highlight the stark contrast 
between polar processing in 2010-11 and that in typical Arctic 
winters, and the parallels with Antarctic conditions (Figs 2, 3). In 2011, 
PSCs or aerosols were abundant until mid-March (Fig. 3a; consistent 
with a deep region with T < Tact, Fig. 3b), much later than usual in the 
Arctic'* °°, with vortex-average amounts at some altitudes similar to 
those in the Antarctic and dramatically larger than the near-zero values 
at that time in most Arctic winters. Furthermore, PSCs in 2011 
spanned an altitude range comparable to that in the Antarctic, an 
uncommon occurrence in the Arctic'*”°. Particles in long-lasting 
PSCs can grow large enough to sediment, resulting in denitrification, 
the permanent removal of HNO₃ from the stratosphere””*. By late March 
2011 no PSCs remained (Fig. 3a), yet HNO₃ mixing ratios were much 
lower than observed in any previous Arctic winter (Fig. 2a). The continuing 
depression in HNO₃ after PSCs had evaporated indicates 
denitrification. Albeit less severe than in typical Antarctic winters 
(Fig. 2b, c, 3c), the extent and degree of denitrification in 2011 were 
unmatched in the Arctic, approaching the range of Antarctic condi- 
tions for the first time. 

Decreasing HCl and increasing ClO signify chlorine activation 
(Fig. 2d-i). Some ClO enhancement has occurred in all recent 
Arctic winters, but has never been as prolonged and extensive as that 
in 2011. In late February, high ClO pervaded the sunlit portion of the 
vortex. The 2011 values vastly exceed the range previously observed in 
the Arctic from late February through to the end of March. They also 
briefly lie outside the Antarctic seasonal envelope, primarily because 
the higher solar zenith angles of the Antarctic measurements used 
here lead to ~30% lower ClO under fully activated conditions. In late 
February, HCl values (unaffected by solar zenith angle issues) fall 
along the lower boundary of the Antarctic envelope, confirming the 
picture seen in ClO. The vertical extent of chlorine activation was also 
comparable to that in the Antarctic (Fig. 3d, e). 

In previous cold Arctic winters, chlorine was deactivated (converted 
from ozone-destroying forms into less reactive reservoir species) by 
mid-March"; even in 1997, ClO started to decline by late February 
(Fig. 2g). In 2011, by contrast, ClO began decreasing rapidly only about 
a week earlier than is typical in the Antarctic. ClO data in late February 
1997 indicate that not only were maximum values lower than those in 
early March 2011, but also the vertical range of enhancement was 
shallower, with weaker activation at low altitudes than in 2011 
(Fig. 3e), consistent with the higher altitudes and decreasing extent 
(Figs 1b, 3b, Supplementary Fig. 2) of T < Tact.

When chlorine is deactivated, whether it is converted first into HCl 
or ClONO2 depends sensitively upon HNO3 and ozone abundances. In
the Arctic, chlorine is normally deactivated through initial reformation
of ClONO2. In the severely denitrified and ozone-depleted Antarctic
vortex, production of ClONO2 is suppressed and that of HCl highly
favoured’?*". In March 2011, the recovery of HCl followed a much
more Antarctic-like pathway than has been observed in any other 
Arctic winter. 

The largest Arctic chemical ozone loss was previously observed in 
2005, followed closely by 2000 and 1996”'*""*. Although low tempera- 
tures persisted until the end of March 1997, the ozone loss in that year 
was far less. No previous year rivals 2011, when the evolution of Arctic 
ozone more closely followed that typical of the Antarctic (Fig. 2j). Ozone 
profiles in late March 2011 resemble typical Antarctic late-winter pro- 
files much more strongly than they do the average Arctic one (Fig. 3f). 
Because mixing in April 2011 (for example, lamination events larger 
than that shown in Fig. 3f) entrained ozone-rich air into the vortex, the 
slight decrease in vortex-averaged ozone at a potential temperature of 
485 K from 26 March to 20 April (from ~1.8 to ~1.6 p.p.m.v., Fig. 2j)
indicates continuing chemical loss during this interval. 


Estimates of chemical ozone loss 


Chemical loss is difficult to quantify in the Arctic, where transport 
from above replenishes ozone in the lower stratospheric vortex, 
obscuring the signature of chlorine-catalysed destruction’*”*”*. The 
evolution of the long-lived trace gas nitrous oxide (N2O) reflects steady
downward transport throughout the 2010-11 winter-spring, indi- 
cating that subsidence partially masked chemical loss. Horizontal 
transport can also confound the signature of chemical loss, bringing 
air into the vortex that has either higher** or lower’* concentrations of 
ozone, depending on the altitude and latitude from which it originates. 

Representative results from two types of chemical loss calcula- 
tions**”* based on balloon-borne and satellite observations are shown 
in Fig. 4. The differences (up to ~0.4 p.p.m.v. at the end of March
2011) in estimates derived from the various methods and data sets 
imply some uncertainty in the chemical loss determination. Year-to- 
year differences in the amount of ozone loss are very similar when 
obtained from any method/data set combination, however, indicating 
a high degree of precision in the relative amount of calculated loss 
between different years. Chemical destruction was severe between
~16 and 22 km altitude, with the largest loss exceeding 2.5 p.p.m.v.
by 26 March 2011 (Fig. 4a). By 31 March 2011, chemical loss was
nearly double that in 2005 from ~18 to above 22 km, and similar to
that in 2005 at lower altitudes (Fig. 4b, c). From ~18 to 20 km, more
than 80% of the ozone present in January had been chemically
destroyed by late March. Chemical removal in 1996 and 2000 started
at a rate similar to that in 2011 (Fig. 4c), but ceased by late March;
maximum losses in 2000 approached those in 2011, but extended over
a much smaller vertical range (Fig. 4b). Loss in 1996, 2000 and 2005
considerably exceeded that in 1997, with greater destruction at lower
altitudes in those years contributing more to total column loss”'*”’.
Chemical loss in 2011 was two to three times larger than that in 1997,
and about twice that in 1996 and 2005 above ~16 km; from ~15 to
23 km it was comparable to that in the Antarctic ozone hole in 1985”’.
Single ozone-sonde station measurements in early April 2011 suggest
continuing ozone loss (Fig. 4c).

Figure 2 | Chemical composition in the lower stratosphere. a-l, Maps (right)
and vortex-averaged time series (left) at 485 K potential temperature (~20 km,
~50 hPa) for four different gases: HNO3 (a, b, c), HCl (d, e, f), ClO (g, h, i) and
O3 (ozone; j, k, l); mixing ratios from Aura MLS are shown. Averaging for the
time series is done within the white contour shown on the maps. Blue (purple)
triangles on time series, 1995-96 (1996-97) values from UARS MLS. Line
colours/shading as in Fig. 1, but shading is for Aura MLS measurements from
2005-10. Antarctic dates are shifted by six months (top axis on time series) to
show the equivalent season. Vertical lines show dates of maps in 2011 (2010) in
the Arctic (Antarctic). Black overlays on HNO3 maps, Tact (~196 K at this
level); HNO3 may be sequestered in PSCs at lower temperatures. Dotted black/
white contour on ClO maps, 92° SZA, poleward of which measurements were
taken in darkness. Yellow/black triangles on ozone maps, locations of the
profiles in Fig. 3.

Although the meteorology during March-April was similar in 1997 
and 2011, ozone loss was much more pronounced in 2011. Photo- 
chemical box model simulations (Supplementary Fig. 4, Supplemen- 
tary Discussion) elucidate how early winter conditions set the stage for 
record springtime ozone destruction in 2011. Chlorine activation 
brought on by enduring cold from December through to the end of 
February led to ~0.7-0.8 p.p.m.v. lower ozone at the beginning of 
March 2011 (Figs 2j, 4c). The early onset of continuous cold also 
facilitated formation of PSC particles large enough to sediment, resulting
in ~4 p.p.b.v. less HNO3 by March in 2011 than in 1997 (Fig. 2a).
The degree of denitrification has a profound impact on the severity of
springtime Arctic ozone loss*. By delaying chlorine deactivation,
lower HNO3 by 1 March was responsible for ~0.6 p.p.m.v. more ozone
loss after that date in 2011 than in 1997 (Supplementary Fig. 4,
Supplementary Discussion). The effects of denitrification and early-
winter loss together account for the disparity in ozone depletion in
these two winters (~1.5 p.p.m.v. more loss at 460 K in 2011 than in
1997, Fig. 4c, Supplementary Fig. 4). Loss as severe as that in 2011 thus
requires T < Tact, with consequent chlorine activation and ozone
destruction, early in winter (as in 1996, 2000 and 2005, but not in
1997), a cold period and region before March sufficient to allow widespread
denitrification, and the persistence of a cold polar vortex into
April (as in 1997, but not in 1996, 2000 or 2005).

Figure 3 | Vertical composition information. a, Red, PSCs/aerosol amounts
averaged in the vortex over a week centred around 25 February 2011; dark blue,
the average for the same week in 2007-10; grey, the average over the equivalent
period (centred on 28 August) for the Antarctic in 2006-10; lavender, the Arctic
average for a week centred around 26 March 2011. (In late winter-spring,
maximum PSC altitudes are generally higher in the Arctic because early winter
PSC activity redistributes HNO3 and water vapour to lower altitudes in the
Antarctic'*). b-f, Daily average profiles of MERRA temperatures (b) and MLS
HNO3 (c), HCl (d), ClO (e) and ozone (f). Red lines, data from a 4° × 15°
latitude × longitude box around 79° N, 12° E; in c, f, taken on 26 March; in
b, d, e, on 6 March 2011. Lavender, 7-day average for 2005-10 (1980-2010 for
b) centred on the same location and days. Grey, profiles in a similar box in the
Antarctic (79° S, 12° E) on 26 September for c, f, and on 8 September 2010 for
b, d, e. Dotted black line in b, approximate Tact (195 K), see text. Purple line in
b, 7-day average around 6 March 1997, centred on same location. Purple line in
e, a midday ClO profile from UARS MLS on 26 February 1997 averaged in an
8° × 30° box centred at the same Arctic location. A high-resolution ozone-
sonde profile at Ny-Ålesund on 26 March 2011 (black in f) agrees well with
MLS; lamination, a signature of mixing with ozone-rich extra-vortex air, is
apparent as a local maximum near 60 hPa.

Figure 4 | Chemical ozone loss estimates. a, Chemical loss as a function of
time and potential temperature from passive subtraction of MLS and ATLAS
passively-transported ozone (initialized with December MLS data).
b, Chemical loss from ozone sondes in unmixed vortex air as a function of
‘spring equivalent potential temperature’* (black contours in a). Shading,
Antarctic range defined by 1985 (the first year with profile measurements
inside the ozone hole”) and 2003 (a recent year with a severe ozone hole). The
2003 Antarctic curve is shifted by six months minus 10 days because ozone
sondes that year predominantly sampled the outermost vortex, where ozone
loss begins earliest. c, Ozone at a spring equivalent potential temperature of
465 K (white contour in a), near the level of maximum chemical loss. Shading,
the region below the minimum reached in the 1985 Antarctic ozone hole. In
April 2011 most soundings sampled the disturbed vortex edge; only two were
made in air uninfluenced by mixing (red dots). Error bars, 1σ uncertainties
based on the scatter of individual ozone-sonde measurements. Line colours as
in Fig. 1; 1998-99 (a winter with no ozone loss) is shown in cyan. NH, Northern
Hemisphere; SH, Southern Hemisphere.

Column ozone


Total column ozone is a predominant factor determining exposure of 
Earth’s surface to ultraviolet radiation”’’. In the context of previous 
Arctic winters, 2011 was truly remarkable: the fraction of the Arctic 
vortex in March with total ozone less than 275 Dobson units (DU) is 
typically near zero, but reached nearly 45% in 2011 (Fig. 5a). Because 
of the dynamically-driven correlation between total ozone and lower- 
stratospheric temperature***'** (Supplementary Discussion), the 
abiding cold in 1997 and 2011 would have led to lower March total 
ozone than in other Arctic winters even without chemical loss; 
dynamical conditions in March-April 1997 particularly favoured 
low total ozone*’ (Supplementary Discussion). In March 2011, however, 
the area of low total ozone covered more than twice as much of the 
vortex as in 1997, and the daily vortex ‘ozone deficit’ (Supplementary 
Fig. 5a) was 30-50 DU larger, consistent with the greater chemical loss 
(Fig. 4). Maximum 2011 vortex fractions of low ozone approached 
those in early Antarctic ozone holes (Fig. 5a). The close correspond- 
ence between the vortex and both low total ozone and the large Arctic 
total ozone deficit (Fig. 5b, d) implies that low total ozone in March 
2011 resulted primarily from chemical loss*’? (Supplementary 
Discussion). The ozone deficit in the Antarctic (Fig. 5e) shows a 
maximum over 0-90° W, and a minimum over 90-200° E, reflecting
a vortex position in 2010 different to that in the reference state (which 
is less robust than that for the Arctic). Differences in morphology deep 
in the vortex are, however, minimal. The 2011 Arctic ozone deficit was 
at least comparable to that in the 2010 Antarctic vortex core at an 
equivalent time. 

Figure 5 | Total column ozone. a, Time series of the fraction of 460 K vortex
area with total ozone below 275 Dobson units (DU) in February-April in the
Arctic (bottom axis), and in August-October in the Antarctic (top axis). Line 
colours/shading as in Fig. 1. 2005-2011 values are from OMI; earlier values are 
from TOMS (Total Ozone Mapping Spectrometer) instruments”’. Maps show 
OMI total ozone (b, c) and ozone deficit (d, e) in the Arctic (Antarctic) on 
26 March 2011 (26 September 2010). Overlays as in Fig. 2 but at 460 K. 

An echo of the Antarctic


In the absence of chemical ozone loss, downward transport during 
winter results in a springtime maximum in total ozone; because this 
transport is stronger in the Arctic, background ozone levels there are 
~100 DU higher than those in the Antarctic’. Therefore Arctic 
spring total ozone could, even after chemical destruction comparable 
to that in an Antarctic ozone hole (commonly defined by values less 
than 220 DU; refs 7, 12), exhibit only a weak maximum in total ozone 
rather than a well-defined minimum. Examination of the long-term 
ozone-sonde record in the Arctic shows that abundances near 250 DU 
or less are well below typical autumn values, thus appearing as a ‘hole’ 
in total ozone. Dynamical processes can result in transient regions of 
very low total ozone (Supplementary Discussion, Supplementary Figs 
5, 6) and/or local minima in lower-stratospheric ozone profiles (for 
example, via ozone-poor extra-vortex air transported into the polar 
vortex'***). For an interhemispheric comparison of chemical loss, it is
thus important to verify that observed Arctic ozone decreases were 
primarily related to chemical, rather than dynamical, processes. 

Figure 4 shows that the precipitous decline in Arctic ozone in 
February-March 2011 resulted from chemical loss of similar mag- 
nitude to that in the Antarctic in the mid-1980s. Observed ozone 
between ~15 and 20 km altitude decreased to values matching the
minima in early Antarctic ozone holes and those reached at the corresponding
time in some recent Antarctic winters (Figs 2j-l, 3f). In
late March-early April, most ozone-sonde profiles in the vortex had 
mixing ratios less than 1 p.p.m.v., with values ~0.7 p.p.m.v. over an 
approximately 2-km altitude region, and some dipping to 0.5 p.p.m.v. 
(Supplementary Fig. 7). Minimum total ozone in spring 2011 was 
continuously below 250 DU for ~27 days (Supplementary Fig. 5b), 
with a maximal area below that level of ~2 × 10^6 km^2 (roughly five
times the area of Germany or California). Values dropped to ~220- 
230 DU for about a week in late March 2011. 

In these respects, chemical ozone destruction in the 2011 Arctic 
polar vortex attained, for the first time, a level clearly identifiable as an 
Arctic ozone hole. On the other hand, although the magnitude of 
chemical depletion was comparable to that in the Antarctic, total 
ozone values remained higher and, because the areal extent of the 
Arctic vortex was much smaller (~60% the size of a typical 
Antarctic vortex), the low-ozone region was more confined. 

The Arctic winter stratosphere exhibits striking interannual vari- 
ability. The past decade has included the four most dynamically active 
(hence among the warmest) Arctic winters in the past 32 years (ref. 
35) and now the two coldest winters with largest ozone loss”'*""*, 
extending the previously noted trend of the coldest winters becoming 
colder’*'*. Had implementation of the Montreal Protocol not curbed 
the increase in stratospheric halogen loading, formation of an Arctic 
ozone hole would have already become common even in moderately 
cold winters*®. Even with the lower anthropogenic halogen levels 
actually reached, the potential for Antarctic-like ozone loss in the 
Arctic in the event of a persistently cold winter-spring such as that 
in 2010-11 has been recognized for decades*”’. Despite temperatures 
that were generally far higher than those in Antarctic winter, Arctic 
chemical ozone destruction in 2011 rivalled that in some Antarctic 
ozone holes. The development of an Arctic ozone hole under condi- 
tions only slightly more extreme than those in some previous Arctic 
winters raises the possibility of yet more severe depletion as lower- 
stratospheric temperatures decrease. More acute Arctic ozone 
destruction could exacerbate biological risks from increased ultra- 
violet radiation exposure, especially if the vortex shifted over densely 
populated mid-latitudes, as it did in April 2011. 

Our present understanding of what drives variability in the Arctic 
winter stratosphere is incomplete. Stratospheric temperatures and 
vortex evolution depend on the atmosphere’s radiative properties 
and propagation of wave activity’’**, which are being modified by 
increasing greenhouse gas concentrations. Day-to-day tropospheric 
disturbances can lead to stratospheric warming or cooling, depending 
on their geographical location and the stratospheric vortex structure,
which controls their upward propagation”. Current climate models 
do not fully capture either the observed short-timescale patterns of 
Arctic variability or the full extent of the observed longer-term cooling 
trend in cold stratospheric winters; nor do they agree on future cir- 
culation changes that affect trends in transport*’*. Our ability to 
predict when conditions similar to, or more extreme than, those in 
2011 may be realized is thus very limited. Improving our predictive 
capabilities for Arctic ozone loss, especially while anthropogenic 
halogen levels remain high, is one of the greatest challenges in polar 
ozone research. Comprehensive stratospheric data sets, such as those 
used here, are critical to meeting that challenge. 


METHODS SUMMARY 


MERRA (Modern Era Retrospective-analysis for Research and Applications)
fields are used for temperature and vortex analysis and for vortex averaging of 
composition measurements. The CALIOP (Cloud-Aerosol Lidar with 
Orthogonal Polarization) on the CALIPSO (Cloud-Aerosol Lidar and Infrared 
Pathfinder Satellite Observations) satellite** provides PSC/aerosol information. 

Trace gas profiles are from the Microwave Limb Sounder (MLS)* on NASA’s 
Aura satellite. Only daytime ClO measurements are used. Northern (southern) 
high latitudes are sampled near midday (in late afternoon), thus the average solar 
zenith angle (SZA) of MLS Antarctic measurements is ~7° higher than that in the 
Arctic. Reactive chlorine partitioning shifts away from ClO at higher SZAs”!, 
leading to ~30% lower ClO measured in the Antarctic than in the Arctic under 
fully activated conditions. An instrument anomaly disrupted MLS measurements 
from 27 March to 20 April 2011. UARS (Upper Atmosphere Research Satellite) 
MLS measurements, used for 1995-1996 and 1996-1997 analyses, are sparse 
because of the UARS yaw cycle and other measurement gaps”*. 

Total column ozone is measured by the Dutch-Finnish Ozone Monitoring 
Instrument (OMI)** on Aura. Total ozone ‘deficit’ is the difference between daily 
values and a reference that is minimally affected by chemical loss. 

Measurements from MLS and the Match network of balloon-borne ozone 
soundings (ozone sondes)” are used to estimate chemical ozone loss in two ways. 
The difference between calculated ‘passive’ (influenced only by transport) ozone 
and observed ozone is computed, with passive ozone obtained using MLS nitrous 
oxide", a ‘reverse trajectory’ model’*”*, and the ATLAS (Alfred Wegener Institute 
Lagrangian Chemistry/Transport System) model”. Vortex ozone is also examined 
on the surfaces on which it subsides!*!*?*"*, with descent rates from modelled 
radiative heating/cooling rates averaged over the polar vortex**. 

Photochemical box model runs were performed using the chemical model 
from ATLAS” to test the sensitivity of ozone loss to initial ozone amounts and 
denitrification. 

Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

Data sets. Modern Era Retrospective-analysis for Research and Applications 
(MERRA)** fields, from the Goddard Earth Observing System Version 5.2.0 
(GEOS-5) data assimilation system, are used for the temperature and vortex 
analysis. The Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) on 
the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations 
(CALIPSO) satellite’ provides PSC/aerosol information. CALIOP measure- 
ments began in April 2006. Trace gas profile measurements are from the 
Microwave Limb Sounder (MLS)* on NASA’s Aura satellite, and the predecessor 
MLS instrument” on the Upper Atmosphere Research Satellite (UARS). Total 
column ozone data are from the Dutch-Finnish Ozone Monitoring Instrument 
(OMI)* on board Aura. The historical total ozone record comprises data from 
Nimbus-7 and Earth Probe Total Ozone Mapping Spectrometer (TOMS)°*°. Aura 
MLS and OMI measurements are available from August 2004 through to the 
present. UARS MLS measurements were obtained from September 1992 through 
to early 2000, with increasingly sparse sampling in the later years**. TOMS data 
are available beginning in 1979, but no TOMS instrument was taking measure- 
ments during the 1995-96 Arctic winter. 

Measurements from the Match network of balloon-borne ozone soundings 
(ozone sondes)*’ are used in some of the chemical ozone loss estimates. 
Temperature and vortex analysis. Potential vorticity (PV) is used to define 
the vortex, with a contour of ‘scaled’ PV of 1.4 × 10^-4 s^-1 (in vorticity units)
demarking the vortex edge*’**. Vortex strength is diagnosed as the maximum
daily gradient in PV as a function of equivalent latitude (the latitude that would
enclose the same area between it and the pole as a given PV contour)"'~. Scaled
PV multiplied by 10^4 is used in the calculation, resulting in units for its gradient of
10^-4 (s degrees equivalent latitude)^-1.
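
The definition of equivalent latitude given above can be made concrete with a minimal sketch. It assumes a spherical Earth with a mean radius of 6,371 km (a value not taken from this paper), for which a polar cap poleward of latitude φ has area 2πR²(1 − sin φ). The function name is hypothetical and is used only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the paper


def equivalent_latitude(area_km2, radius_km=EARTH_RADIUS_KM):
    """Latitude (degrees) of the polar cap whose area equals area_km2.

    A cap poleward of latitude phi on a sphere has area
    2*pi*R^2*(1 - sin(phi)); this function inverts that relation.
    """
    hemisphere_area = 2.0 * math.pi * radius_km ** 2
    if not 0.0 <= area_km2 <= hemisphere_area:
        raise ValueError("area must lie between 0 and one hemisphere")
    return math.degrees(math.asin(1.0 - area_km2 / hemisphere_area))
```

In practice the input would be the area enclosed by a given PV contour; a larger enclosed area maps to a lower equivalent latitude.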

The temperature threshold for chlorine activation, Tact, is estimated using the
formula for nitric acid trihydrate formation”, which depends on pressure, HNO3
and H2O. Climatological HNO3 and H2O profiles are used, derived from UARS
data. The area with T < Tact is calculated on seven isentropic surfaces in the lower
stratosphere: 390, 410, 430, 460, 490, 520 and 550 K; Tact on these levels is 197.5,
197.2, 196.8, 196.5, 195.9, 195.3 and 194.5 K, respectively. To get the volume with
T < Tact from 380 through 565 K, the areas at each of the seven levels are multiplied
by the estimated altitude range associated with that layer and summed. The
altitude range associated with each layer is obtained from a standard potential
temperature profile as a function of altitude derived from high-latitude temperature
soundings taken during the 1988-89 through to 2001-02 winters (the same
profile was used for Vpsc calculations in refs 13, 16 and 48). These thicknesses are
1.29088, 1.19995, 1.36770, 1.46281, 1.30554, 1.18199 and 1.07382 km for the
seven levels listed above. Vortex volume is calculated from vortex area in the
same manner. Winter mean Vpsc is calculated over 16 December through to 15
April. Previous studies have shown that Vpsc scaled by the vortex area is a good
proxy for chlorine activation and ozone loss potential'’. Additional temperature
and vortex diagnostics are described in Supplementary Information.
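
The volume calculation described above amounts to a thickness-weighted sum of areas. The following sketch uses the seven layer thicknesses quoted in the text; the function names and the area values in the usage note are hypothetical and chosen only for illustration.

```python
# Isentropic levels (K) and layer thicknesses (km), as listed in the text.
LEVELS_K = [390, 410, 430, 460, 490, 520, 550]
THICKNESS_KM = [1.29088, 1.19995, 1.36770, 1.46281, 1.30554, 1.18199, 1.07382]


def layer_volume_km3(areas_km2):
    """Sum area x layer thickness over the seven levels (result in km^3)."""
    if len(areas_km2) != len(THICKNESS_KM):
        raise ValueError("expected one area per isentropic level")
    return sum(a * dz for a, dz in zip(areas_km2, THICKNESS_KM))


def cold_volume_fraction(cold_areas_km2, vortex_areas_km2):
    """Volume with T < Tact expressed as a fraction of the vortex volume."""
    return layer_volume_km3(cold_areas_km2) / layer_volume_km3(vortex_areas_km2)
```

For example, if the cold region covered half of the vortex area at every level, `cold_volume_fraction([5e6] * 7, [1e7] * 7)` would give 0.5. The summed thicknesses span about 8.9 km.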

Polar stratospheric cloud and aerosol information. Particulate backscatter 
averaged over the polar vortex derived from CALIOP data is used to provide 
PSC/aerosol information. Total attenuated backscatter at 532 nm, b(z), is one of
the basic CALIOP Level 1B data products. b(z) is the sum of the particulate
backscatter (due to liquid aerosol and PSCs), bp(z), and molecular backscatter,
bm(z). bm(z) is calculated using GEOS-5 molecular density profiles (included in
the CALIOP Level 1B data files) and a theoretical value for the molecular scattering
cross-section”’. Profiles of bp(z) are then produced by subtracting bm(z) from
b(z). Vortex-averaged profiles of bp(z) are produced by averaging all CALIOP
bp(z) profiles located inside the vortex edge (defined using information available
in GEOS-5 Derived Meteorological Product (DMP) files for the nearly-coincident
Aura MLS data°*) over the selected time interval.
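
The two steps above (subtracting the molecular contribution, then averaging inside the vortex) can be sketched as follows. This is an illustrative outline only: the array shapes, the boolean vortex flag and the function names are assumptions, and reading of the CALIOP Level 1B files is not shown.

```python
import numpy as np


def particulate_backscatter(total, molecular):
    """Particulate = total minus molecular backscatter, element-wise."""
    return np.asarray(total, dtype=float) - np.asarray(molecular, dtype=float)


def vortex_average(profiles, in_vortex):
    """Mean of the profiles flagged as inside the vortex, ignoring NaNs.

    profiles: array of shape (n_profiles, n_levels)
    in_vortex: boolean array of shape (n_profiles,)
    """
    selected = np.asarray(profiles, dtype=float)[np.asarray(in_vortex, dtype=bool)]
    return np.nanmean(selected, axis=0)
```

Both inputs to `particulate_backscatter` are assumed to be on the same altitude grid.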

MLS trace gas profile measurements and analysis. Trace gas profile measure- 
ments of HNO3, HCl, ClO, ozone and N2O (a long-lived tracer used to assess 
descent) are from Aura MLS* version 3 retrievals; data quality screening is as 
recommended in the MLS data quality document*®. MLS data are retrieved on 
pressure surfaces; potential temperature as a function of pressure from MLS 
DMPs” calculated from GEOS-5 analyses is used to interpolate to isentropic 
surfaces. Vortex averages of MLS data are calculated using the 1.4 × 10^-4 s^-1
scaled PV contour to define the vortex edge, using PV values from the MLS 
DMPs”. Active chlorine is in the form of ClO mainly during the daytime, and 
thus measured ClO amounts vary with the solar zenith angle (SZA) at which the 
measurements are taken. Only daytime ClO measurements are used here. 
Northern high latitudes are sampled near midday local time and southern high
latitudes in late afternoon; thus the SZA of Aura MLS Antarctic
measurements is ~7° higher on average than that in the Arctic. Reactive chlorine 
partitioning shifts away from ClO at higher SZAs”", leading to ~30% lower ClO 
measured by Aura MLS in the Antarctic than in the Arctic under fully activated
conditions. MLS measurements are unavailable from 27 March through to
20 April 2011 because of an instrument anomaly. Upper Atmosphere Research 
Satellite (UARS) MLS measurements, used for analysis of 1995-96 and 1996-97, 
are sparse because of the UARS yaw cycle and other measurement gaps’. The 
time of day of UARS measurements varied through the yaw cycle, in the middle of 
which no daytime ClO measurements were obtained"; thus ClO values shown in 
1995-96 and 1996-97 near those dates (including the mid-February 1996 mea- 
surements shown in Fig. 2g) are not representative of the degree of chlorine 
activation. 
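
The interpolation of MLS profiles from pressure surfaces to isentropic surfaces mentioned above can be illustrated with a minimal sketch. It assumes simple linear interpolation in potential temperature, which the text does not specify, and the function name is hypothetical.

```python
import numpy as np


def to_isentropic(theta_profile, mixing_ratio, theta_target):
    """Linearly interpolate a profile to one potential-temperature level.

    theta_profile: potential temperature (K) on the retrieval pressure levels
    mixing_ratio: gas mixing ratio on the same levels
    """
    theta = np.asarray(theta_profile, dtype=float)
    vmr = np.asarray(mixing_ratio, dtype=float)
    order = np.argsort(theta)  # np.interp needs increasing x values
    return float(np.interp(theta_target, theta[order], vmr[order]))
```

Note that `np.interp` returns the end-point value for targets outside the profile range, so a real analysis would need to screen such cases.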

Chemical loss calculations. Chemical ozone loss is quantified by two methods, 
both widely used for such calculations”'?***““8. In the ‘passive subtraction’
method’”’, a transport model is used to calculate the evolution of ozone in 
the absence of chemical changes (‘passive’ ozone). The difference between passive 
ozone and observed ozone provides an estimate of chemical loss. 
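
As a minimal sketch of the passive subtraction idea, chemical loss is the passive (transport-only) ozone minus the observed ozone at matching times or levels. The function name is hypothetical, and the numbers in the usage note are illustrative values, not data from this study.

```python
def chemical_loss(passive_ppmv, observed_ppmv):
    """Chemical loss = passive (transport-only) ozone minus observed ozone."""
    if len(passive_ppmv) != len(observed_ppmv):
        raise ValueError("series must have the same length")
    return [p - o for p, o in zip(passive_ppmv, observed_ppmv)]
```

For instance, a passive value of 3.5 p.p.m.v. paired with an observed value of 1.0 p.p.m.v. would correspond to 2.5 p.p.m.v. of chemical loss.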

Here, passive ozone is obtained in three different ways. First, MLS observations 
of N2O, a long-lived species unaffected by chemical processes, are used to calculate
vertical motion, and that estimate of descent is then used to calculate how
initial MLS ozone profiles would have evolved in the absence of chemical loss". 
Second, a ‘reverse trajectory’ transport model**”® is used to transport an initial 
state based on MLS-observed ozone with no chemistry. Finally, the ATLAS 
(Alfred Wegener Institute Lagrangian Chemistry/Transport System) chemistry 
and transport model is run in passive mode”, initialized with MLS ozone. 

Vortex ozone is also examined in relation to the surfaces on which it is
subsiding’*"*”**. The descent rates used here are obtained by averaging radiative
heating/cooling rates from the radiation calculation used in the ATLAS model 
over the polar vortex“*. These rates are then used to examine vortex-averaged 
MLS and ozone-sonde data on surfaces of ‘spring equivalent potential
temperature’*, defined as the potential temperature at which air originating at a given
level arrived at the end of March. Since the air descended on these surfaces, ozone 
would have been constant on each such surface in the absence of chemical loss. 

The ozone-sonde data used here are all from electrochemical concentration cell 
(ECC) sondes, made by different manufacturers. Ozone-sonde data quality was 
assessed in an intercomparison experiment” and is discussed in ref. 47. For 
chemical loss calculations using ozone-sonde data, the profiles are first examined 
using a procedure for detecting lamination in the profiles; such lamination (an 
example is shown in Fig. 3f) is associated with mixing in of extra-vortex air, which 
may obscure the signature of chemical loss. Profiles that have been significantly 
altered by mixing processes, as indicated by lamination, are excluded from the 
vortex averages used in the chemical loss calculations. 2010-11 Arctic ozone- 
sonde data are provided as Supplementary Information. 

Results from the ATLAS model passive subtraction calculations, and from the 
calculations on spring equivalent potential temperature surfaces using the Match 
network ozone-sonde data, are shown in Fig. 4; all panels show vortex averages. 
These results have been compared with the results from the other methods 
described above. While absolute ozone values obtained from different methods/ 
data sets vary significantly (up to ~0.4 p.p.m.v. at the end of March 2011), the 
year-to-year variations in chemical loss calculated using all three methods agree 
closely, indicating a high degree of precision in the relative amount of calculated 
loss between different years. 

The Alfred Wegener Institute chemical box model, also used as the chemical 
module in ATLAS, simulates 175 reactions between 48 chemical species in the 
stratosphere*”**. This model was used to perform conceptual runs (Supplementary
Fig. 4), started on 1 March with identical initial mixing ratios of all species
except HNO3 and O3. For these two species, values corresponding to 1997
(3 p.p.m.v. O3, 10 p.p.b.v. HNO3) and 2011 (2.2 p.p.m.v. O3, 6 p.p.b.v. HNO3)
(compare Figs 2a and 4c) were combined to yield four sets of initial conditions.
Initial ClOx was 2 p.p.b.v., corresponding to the vortex-averaged ClOx derived by
ATLAS from MLS ClO measurements on 1 March 2011. An air parcel at 70° N, 
460 K potential temperature, with a temperature of 193 K throughout March, was 
used. Heterogeneous reactions took place on liquid aerosols, rather than solid 
(nitric acid trihydrate, NAT) PSCs, since the widespread existence of the latter is 
inconsistent with MLS observations of gas-phase HNO3 values (Fig. 2a) larger
than those the microphysical module predicts if NAT is present. A sensitivity 
run showed that sporadically occurring solid PSCs did not change the results 
significantly. 
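
The four sets of initial conditions described above are the combinations of the two ozone values with the two nitric acid values. The sketch below only enumerates those combinations from the numbers quoted in the text; it does not reproduce the box model itself, and the dictionary keys and function name are hypothetical.

```python
from itertools import product

# Initial values quoted in the text for 1 March
# (ozone in p.p.m.v., nitric acid and active chlorine in p.p.b.v.).
O3_PPMV = {"1997": 3.0, "2011": 2.2}
HNO3_PPBV = {"1997": 10.0, "2011": 6.0}
CLOX_PPBV = 2.0  # identical in every run


def initial_conditions():
    """Return the four ozone/nitric-acid combinations as a list of dicts."""
    runs = []
    for o3_year, hno3_year in product(O3_PPMV, HNO3_PPBV):
        runs.append({
            "o3_from": o3_year,
            "hno3_from": hno3_year,
            "o3_ppmv": O3_PPMV[o3_year],
            "hno3_ppbv": HNO3_PPBV[hno3_year],
            "clox_ppbv": CLOX_PPBV,
        })
    return runs
```

Comparing runs that differ in only one of the two species is what allows the separate effects of early-winter ozone loss and denitrification to be isolated.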

Column ozone and ozone deficit calculation. OMI total ozone data were 
processed with version 8.5 of the TOMS algorithm and have been extensively 
validated. TOMS data were processed with version 8 of the algorithm. The OMI 
and TOMS total ozone data used in this study were averaged on a fixed global 
1° × 1° latitude × longitude grid. Averages were computed by area-weighting
observations based on the overlap of their instantaneous field-of-view with each 
grid cell. Only data that satisfy quality criteria based on measurement path length 
and algorithm diagnostic criteria were included in the averaged samples. 
Individual total ozone retrievals included in the samples are expected to have a
root-mean-squared error of 1-2%. 

Total ozone ‘deficit’ is calculated as the difference between daily values and a 
reference that is minimally affected by chemical ozone loss. The reference for the 
Arctic is the daily mean over all Arctic winters from 1978-79 through to 2009-10, 
from OMI starting in 2004-05 and from TOMS for earlier years”. The Antarctic 
reference state is the daily mean of TOMS measurements for 1979 through to 
1981. Because the Antarctic reference state is based on only three years’ data for 
each day, variations in vortex position are not effectively averaged out; this 
reference is thus less robust than that for the Arctic, so patterns in daily maps 
may partially reflect differences in vortex position between the reference and the 
focus day. 
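
The deficit calculation and the low-ozone vortex fraction can be sketched as below. Several points are assumptions made for illustration: the sign convention (reference minus daily value, so that a positive deficit means less ozone than the reference), the cosine-of-latitude weighting used as a stand-in for grid-cell area on a regular latitude-longitude grid, and the function names.

```python
import numpy as np


def ozone_deficit(daily_du, reference_du):
    """Deficit (DU), assumed here to be reference minus daily total ozone."""
    return np.asarray(reference_du, dtype=float) - np.asarray(daily_du, dtype=float)


def low_ozone_fraction(total_ozone_du, lat_deg, in_vortex, threshold_du=275.0):
    """Area-weighted fraction of vortex grid cells below threshold_du.

    All inputs are flat arrays with one entry per grid cell; cells with
    missing (NaN) ozone are left out of both numerator and denominator.
    """
    ozone = np.asarray(total_ozone_du, dtype=float)
    weights = np.cos(np.radians(np.asarray(lat_deg, dtype=float)))
    valid = np.asarray(in_vortex, dtype=bool) & np.isfinite(ozone)
    low = valid & (ozone < threshold_du)
    return float(weights[low].sum() / weights[valid].sum())
```

The 275 DU default mirrors the threshold used for the vortex-fraction time series in Fig. 5a.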